Search results for: personal data

7295 Real Time Approach for Data Placement in Wireless Sensor Networks

Abstract:

The issue of real-time and reliable report delivery is extremely important for taking effective decision in a real world mission critical Wireless Sensor Network (WSN) based application. The sensor data behaves differently in many ways from the data in traditional databases. WSNs need a mechanism to register, process queries, and disseminate data. In this paper we propose an architectural framework for data placement and management. We propose a reliable and real time approach for data placement and achieving data integrity using self organized sensor clusters. Instead of storing information in individual cluster heads as suggested in some protocols, in our architecture we suggest storing of information of all clusters within a cell in the corresponding base station. For data dissemination and action in the wireless sensor network we propose to use Action and Relay Stations (ARS). To reduce average energy dissipation of sensor nodes, the data is sent to the nearest ARS rather than base station. We have designed our architecture in such a way so as to achieve greater energy savings, enhanced availability and reliability.

Keywords: Cluster head, data reliability, real time communication, wireless sensor networks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1771

7294 Data Mining in Medicine Domain Using Decision Trees and Vector Support Machine

Authors: Djamila Benhaddouche, Abdelkader Benyettou

Abstract:

In this paper, we used data mining to extract biomedical knowledge. In general, complex biomedical data collected in studies of populations are treated by statistical methods, although they are robust, they are not sufficient in themselves to harness the potential wealth of data. For that you used in step two learning algorithms: the Decision Trees and Support Vector Machine (SVM). These supervised classification methods are used to make the diagnosis of thyroid disease. In this context, we propose to promote the study and use of symbolic data mining techniques.

Keywords: A classifier, Algorithms decision tree, knowledge extraction, Support Vector Machine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1828

7293 A Software Framework for Predicting Oil-Palm Yield from Climate Data

Authors: Mohd. Noor Md. Sap, A. Majid Awan

Abstract:

Intelligent systems based on machine learning techniques, such as classification, clustering, are gaining wide spread popularity in real world applications. This paper presents work on developing a software system for predicting crop yield, for example oil-palm yield, from climate and plantation data. At the core of our system is a method for unsupervised partitioning of data for finding spatio-temporal patterns in climate data using kernel methods which offer strength to deal with complex data. This work gets inspiration from the notion that a non-linear data transformation into some high dimensional feature space increases the possibility of linear separability of the patterns in the transformed space. Therefore, it simplifies exploration of the associated structure in the data. Kernel methods implicitly perform a non-linear mapping of the input data into a high dimensional feature space by replacing the inner products with an appropriate positive definite function. In this paper we present a robust weighted kernel k-means algorithm incorporating spatial constraints for clustering the data. The proposed algorithm can effectively handle noise, outliers and auto-correlation in the spatial data, for effective and efficient data analysis by exploring patterns and structures in the data, and thus can be used for predicting oil-palm yield by analyzing various factors affecting the yield.

Keywords: Pattern analysis, clustering, kernel methods, spatial data, crop yield

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1932

7292 A Proposal for U-City (Smart City) Service Method Using Real-Time Digital Map

Authors: SangWon Han, MuWook Pyeon, Sujung Moon, DaeKyo Seo

Abstract:

Recently, technologies based on three-dimensional (3D) space information are being developed and quality of life is improving as a result. Research on real-time digital map (RDM) is being conducted now to provide 3D space information. RDM is a service that creates and supplies 3D space information in real time based on location/shape detection. Research subjects on RDM include the construction of 3D space information with matching image data, complementing the weaknesses of image acquisition using multi-source data, and data collection methods using big data. Using RDM will be effective for space analysis using 3D space information in a U-City and for other space information utilization technologies.

Keywords: RDM, multi-source data, big data, U-City.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 763

7291 Agile Methodology for Modeling and Design of Data Warehouses -AM4DW-

Authors: Nieto Bernal Wilson, Carmona Suarez Edgar

Abstract:

The organizations have structured and unstructured information in different formats, sources, and systems. Part of these come from ERP under OLTP processing that support the information system, however these organizations in OLAP processing level, presented some deficiencies, part of this problematic lies in that does not exist interesting into extract knowledge from their data sources, as also the absence of operational capabilities to tackle with these kind of projects. Data Warehouse and its applications are considered as non-proprietary tools, which are of great interest to business intelligence, since they are repositories basis for creating models or patterns (behavior of customers, suppliers, products, social networks and genomics) and facilitate corporate decision making and research. The following paper present a structured methodology, simple, inspired from the agile development models as Scrum, XP and AUP. Also the models object relational, spatial data models, and the base line of data modeling under UML and Big data, from this way sought to deliver an agile methodology for the developing of data warehouses, simple and of easy application. The methodology naturally take into account the application of process for the respectively information analysis, visualization and data mining, particularly for patterns generation and derived models from the objects facts structured.

Keywords: Data warehouse, model data, big data, object fact, object relational fact, process developed data warehouse.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1437

7290 Distributed Data-Mining by Probability-Based Patterns

Authors: M. Kargar, F. Gharbalchi

Abstract:

In this paper a new method is suggested for distributed data-mining by the probability patterns. These patterns use decision trees and decision graphs. The patterns are cared to be valid, novel, useful, and understandable. Considering a set of functions, the system reaches to a good pattern or better objectives. By using the suggested method we will be able to extract the useful information from massive and multi-relational data bases.

Keywords: Data-mining, Decision tree, Decision graph, Pattern, Relationship.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1509

7289 K-Means for Spherical Clusters with Large Variance in Sizes

Authors: A. M. Fahim, G. Saake, A. M. Salem, F. A. Torkey, M. A. Ramadan

Abstract:

Data clustering is an important data exploration technique with many applications in data mining. The k-means algorithm is well known for its efficiency in clustering large data sets. However, this algorithm is suitable for spherical shaped clusters of similar sizes and densities. The quality of the resulting clusters decreases when the data set contains spherical shaped with large variance in sizes. In this paper, we introduce a competent procedure to overcome this problem. The proposed method is based on shifting the center of the large cluster toward the small cluster, and recomputing the membership of small cluster points, the experimental results reveal that the proposed algorithm produces satisfactory results.

Keywords: K-Means, Data Clustering, Cluster Analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3241

7288 Representing Data without Lost Compression Properties in Time Series: A Review

Authors: Nabilah Filzah Mohd Radzuan, Zalinda Othman, Azuraliza Abu Bakar, Abdul Razak Hamdan

Abstract:

Uncertain data is believed to be an important issue in building up a prediction model. The main objective in the time series uncertainty analysis is to formulate uncertain data in order to gain knowledge and fit low dimensional model prior to a prediction task. This paper discusses the performance of a number of techniques in dealing with uncertain data specifically those which solve uncertain data condition by minimizing the loss of compression properties.

Keywords: Compression properties, uncertainty, uncertain time series, mining technique, weather prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1574

7287 Towards Innovation Performance among University Staff

Authors: C. S. Quah, S. P. L. Sim

Abstract:

This study examined how individuals in their respective teams contributed to innovation performance besides defining the term of innovation in their own respective views. This study also identified factors that motivated University staff to contribute to the innovation products. In addition, it examined whether there is a significant relationship between professional training level and the length of service among university staff towards innovation and to what extent do the two variables contributed towards innovative products. The significance of this study is that it revealed the strengths and weaknesses of the university staff when contributing to innovation performance. Stratified-random sampling was employed to determine the samples representing the population of lecturers in the study, involving 123 lecturers in one of the local universities in Malaysia. The method employed to analyze the data is through categorizing into themes for the open-ended questions besides using descriptive and inferential statistics for the quantitative data. This study revealed that two types of definition for the term “innovation” exist among the university staff, namely, creation of new product or new approach to do things as well as value-added creative way to upgrade or improve existing process and service to be more efficient. This study found that the most prominent factor that propels them towards innovation is to improve the product in order to benefit users, followed by selfsatisfaction and recognition. This implies that the staff in the organization viewed the creation of innovative products as a process of growth to fulfill the needs of others and also to realize their personal potential. This study also found that there was only a significant relationship between the professional training level and the length of service of 4 - 6 years among the university staff. The rest of the groups based on the length of service showed that there was no significant relationship with the professional training level towards innovation. Moreover, results of the study on directional measures depicted that the relationship for the length of service of 4- 6 years with professional training level among the university staff is quite weak. This implies that good organization management lies on the shoulders of the key leaders who enlighten the path to be followed by the staff.

Keywords: Innovation, length of service, performance, professional training level, motivation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1532

7286 Are XBRL-based Financial Reports Better than Non-XBRL Reports? A Quality Assessment

Authors: Zhenkun Wang, Simon S. Gao

Abstract:

Using a scoring system, this paper provides a comparative assessment of the quality of data between XBRL formatted financial reports and non-XBRL financial reports. It shows a major improvement in the quality of data of XBRL formatted financial reports. Although XBRL formatted financial reports do not show much advantage in the quality at the beginning, XBRL financial reports lately display a large improvement in the quality of data in almost all aspects. With the improved XBRL web data managing, presentation and analysis applications, XBRL formatted financial reports have a much better accessibility, are more accurate and better in timeliness.

Keywords: Data Quality; Financial Report; Information; XBRL

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2504

7285 Modeling of Random Variable with Digital Probability Hyper Digraph: Data-Oriented Approach

Authors: A. Habibizad Navin, M. Naghian Fesharaki, M. Mirnia, M. Kargar

Abstract:

In this paper we introduce Digital Probability Hyper Digraph for modeling random variable as the hierarchical data-oriented model.

Keywords: Data-Oriented Models, Data Structure, DigitalProbability Hyper Digraph, Random Variable, Statistic andProbability.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1230

7284 Wireless Transmission of Big Data Using Novel Secure Algorithm

Authors: K. Thiagarajan, K. Saranya, A. Veeraiah, B. Sudha

Abstract:

This paper presents a novel algorithm for secure, reliable and flexible transmission of big data in two hop wireless networks using cooperative jamming scheme. Two hop wireless networks consist of source, relay and destination nodes. Big data has to transmit from source to relay and from relay to destination by deploying security in physical layer. Cooperative jamming scheme determines transmission of big data in more secure manner by protecting it from eavesdroppers and malicious nodes of unknown location. The novel algorithm that ensures secure and energy balance transmission of big data, includes selection of data transmitting region, segmenting the selected region, determining probability ratio for each node (capture node, non-capture and eavesdropper node) in every segment, evaluating the probability using binary based evaluation. If it is secure transmission resume with the two- hop transmission of big data, otherwise prevent the attackers by cooperative jamming scheme and transmit the data in two-hop transmission.

Keywords: Big data, cooperative jamming, energy balance, physical layer, two-hop transmission, wireless security.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2144

7283 AHP and Extent Fuzzy AHP Approach for Prioritization of Performance Measurement Attributes

Authors: Remica Aggarwal, Sanjeet Singh

Abstract:

The decision to recruit manpower in an organization requires clear identification of the criteria (attributes) that distinguish successful from unsuccessful performance. The choice of appropriate attributes or criteria in different levels of hierarchy in an organization is a multi-criteria decision problem and therefore multi-criteria decision making (MCDM) techniques can be used for prioritization of such attributes. Analytic Hierarchy Process (AHP) is one such technique that is widely used for deciding among the complex criteria structure in different levels. In real applications, conventional AHP still cannot reflect the human thinking style as precise data concerning human attributes are quite hard to be extracted. Fuzzy logic offers a systematic base in dealing with situations, which are ambiguous or not well defined. This study aims at defining a methodology to improve the quality of prioritization of an employee-s performance measurement attributes under fuzziness. To do so, a methodology based on the Extent Fuzzy Analytic Hierarchy Process is proposed. Within the model, four main attributes such as Subject knowledge and achievements, Research aptitude, Personal qualities and strengths and Management skills with their subattributes are defined. The two approaches conventional AHP approach and the Extent Fuzzy Analytic Hierarchy Process approach have been compared on the same hierarchy structure and criteria set.

Keywords: AHP, Extent analysis method, Fuzzy AHP, Prioritization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4829

7282 Investigating the Individual Difference Antecedents of Perceived Enjoyment in the Acceptance of Blogging

Authors: Yi-Shun Wang, Hsin-Hui Lin, Yi-Wen Liao

Abstract:

With the proliferation of Weblogs (blogs) use in educational contexts, gaining a better understanding of why students are willing to utilize blog systems has become an important topic for practitioners and academics. While perceived enjoyment has been found to have a significant influence on behavioral intentions to use blogs or hedonic systems, few studies have investigated the antecedents of perceived enjoyment in the acceptance of blogging. The main purpose of the present study is to explore the individual difference antecedents of perceived enjoyment and examine how they influence behavioral intention to blog through the mediation of perceived enjoyment. Based on the previous literature, the Big Five personality traits (i.e., extraversion, agreeableness, conscientiousness, neuroticism, and openness to experience), as well as computer self-efficacy and personal innovation in information technology (PIIT), are hypothesized as potential antecedents of perceived enjoyment in the acceptance of blogging. Data collected from 358 respondents in Taiwan are tested against the research model using the structural equation modeling approach. The results indicate that extraversion, agreeableness, conscientiousness, and PIIT have a significant influence on perceived enjoyment, which in turn significantly influences the behavioral intention to blog. These findings lead to several important implications for future research.

Keywords: Individual difference, Big Five personality traits, perceived enjoyment, blogging

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2069

7281 Implementation of Parallel Interface for Microprocessor Trainer

Authors: Moe Moe Htun, Khin Htar Nwe

Abstract:

In this paper, parallel interface for microprocessor trainer was implemented. A programmable parallel–port device such as the IC 8255A is initialized for simple input or output and for handshake input or output by choosing kinds of modes. The hardware connections and the programs can be used to interface microprocessor trainer and a personal computer by using IC 8255A. The assembly programs edited on PC-s editor can be downloaded to the trainer.

Keywords: Parallel I/O ports, parallel interface, trainer, two 8255 ICs.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3120

7280 Challenges and Professional Perspectives for Pedagogy Undergraduates with Specific Learning Disability: A Greek Case Study

Authors: Tatiani D. Mousoura

Abstract:

Specific learning disability (SLD) in higher education has been partially explored in Greece so far. Moreover, opinions on professional perspectives for university students with SLD, is scarcely encountered in Greek research. The perceptions of the hidden character of SLD along with the university policy towards it and professional perspectives that result from this policy have been examined in the present research. This study has applied the paradigm of a Greek Tertiary Pedagogical Education Department (Early Childhood Education). Via mixed methods, data have been collected from different groups of people in the Pedagogical Department: students with SLD and without SLD, academic staff and administration staff, all of which offer the opportunity for triangulation of the findings. Qualitative methods include ten interviews with students with SLD and 15 interviews with academic staff and 60 hours of observation of the students with SLD. Quantitative methods include 165 questionnaires completed by third and fourth-year students and five questionnaires completed by the administration staff. Thematic analyses of the interviews’ data and descriptive statistics on the questionnaires’ data have been applied for the processing of the results. The use of medical terms to define and understand SLD was common in the student cohort, regardless of them having an SLD diagnosis. However, this medical model approach is far more dominant in the group of students without SLD who, by majority, hold misconceptions on a definitional level. The academic staff group seems to be leaning towards a social approach concerning SLD. According to them, diagnoses may lead to social exclusion. The Pedagogical Department generally endorses the principles of inclusion and complies with the provision of oral exams for students with SLD. Nevertheless, in practice, there seems to be a lack of regular academic support for these students. When such support does exist, it is only through individual initiatives. With regards to their prospective profession, students with SLD can utilize their personal experience, as well as their empathy; these appear to be unique weapons in their hands –in comparison with other educators− when it comes to teaching students in the future. In the Department of Pedagogy, provision towards SLD results sporadic, however the vision of an inclusive department does exist. Based on their studies and their experience, pedagogy students with SLD claim that they have an experiential internalized advantage for their future career as educators.

Keywords: Specific learning disability, dyslexia, pedagogy department, inclusion, professional role of SLDed educators, higher education, university policy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 952

7279 Study of Efficiency and Capability LZW++ Technique in Data Compression

Authors: Yusof. Mohd Kamir, Mat Deris. Mohd Sufian, Abidin. Ahmad Faisal Amri

Abstract:

The purpose of this paper is to show efficiency and capability LZWµ in data compression. The LZWµ technique is enhancement from existing LZW technique. The modification the existing LZW is needed to produce LZWµ technique. LZW read one by one character at one time. Differ with LZWµ technique, where the LZWµ read three characters at one time. This paper focuses on data compression and tested efficiency and capability LZWµ by different data format such as doc type, pdf type and text type. Several experiments have been done by different types of data format. The results shows LZWµ technique is better compared to existing LZW technique in term of file size.

Keywords: Data Compression, Huffman Encoding, LZW, LZWµ, RLL, Size.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2039

7278 Impact of Stack Caches: Locality Awareness and Cost Effectiveness

Authors: Abdulrahman K. Alshegaifi, Chun-Hsi Huang

Abstract:

Treating data based on its location in memory has received much attention in recent years due to its different properties, which offer important aspects for cache utilization. Stack data and non-stack data may interfere with each other’s locality in the data cache. One of the important aspects of stack data is that it has high spatial and temporal locality. In this work, we simulate non-unified cache design that split data cache into stack and non-stack caches in order to maintain stack data and non-stack data separate in different caches. We observe that the overall hit rate of non-unified cache design is sensitive to the size of non-stack cache. Then, we investigate the appropriate size and associativity for stack cache to achieve high hit ratio especially when over 99% of accesses are directed to stack cache. The result shows that on average more than 99% of stack cache accuracy is achieved by using 2KB of capacity and 1-way associativity. Further, we analyze the improvement in hit rate when adding small, fixed, size of stack cache at level1 to unified cache architecture. The result shows that the overall hit rate of unified cache design with adding 1KB of stack cache is improved by approximately, on average, 3.9% for Rijndael benchmark. The stack cache is simulated by using SimpleScalar toolset.

Keywords: Hit rate, Locality of program, Stack cache, and Stack data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1466

7277 Cross Project Software Fault Prediction at Design Phase

Authors: Pradeep Singh, Shrish Verma

Abstract:

Software fault prediction models are created by using the source code, processed metrics from the same or previous version of code and related fault data. Some company do not store and keep track of all artifacts which are required for software fault prediction. To construct fault prediction model for such company, the training data from the other projects can be one potential solution. Earlier we predicted the fault the less cost it requires to correct. The training data consists of metrics data and related fault data at function/module level. This paper investigates fault predictions at early stage using the cross-project data focusing on the design metrics. In this study, empirical analysis is carried out to validate design metrics for cross project fault prediction. The machine learning techniques used for evaluation is Naïve Bayes. The design phase metrics of other projects can be used as initial guideline for the projects where no previous fault data is available. We analyze seven datasets from NASA Metrics Data Program which offer design as well as code metrics. Overall, the results of cross project is comparable to the within company data learning.

Keywords: Software Metrics, Fault prediction, Cross project, Within project.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2477

7276 Transforming Health Information from Manual to Digital (Electronic) World–Reference and Guide

Authors: S. Karthikeyan, Naveen Bindra

Abstract:

Introduction: To update ourselves and understand the concept of latest electronic formats available for Health care providers and how it could be used and developed as per standards. The idea is to correlate between the patients Manual Medical Records keeping and maintaining patients Electronic Information in a Health care setup in this world. Furthermore, this stands with adapting to the right technology depending upon the organization and improve our quality and quantity of Healthcare providing skills. Objective: The concept and theory is to explain the terms of Electronic Medical Record (EMR), Electronic Health Record (EHR) and Personal Health Record (PHR) and selecting the best technical among the available Electronic sources and software before implementing. It is to guide and make sure the technology used by the end users without any doubts and difficulties. The idea is to evaluate is to admire the uses and barriers of EMR-EHR-PHR. Aim and Scope: The target is to achieve the health care providers like Physicians, Nurses, Therapists, Medical Bill reimbursements, Insurances and Government to assess the patient’s information on easy and systematic manner without diluting the confidentiality of patient’s information. Method: Health Information Technology can be implemented with the help of Organisations providing with legal guidelines and help to stand by the health care provider. The main objective is to select the correct embedded and affordable database management software and generating large-scale data. The parallel need is to know how the latest software available in the market. Conclusion: The question lies here is implementing the Electronic information system with healthcare providers and organization. The clinicians are the main users of the technology and manage us to “go paperless”. The fact is that day today changing technologically is very sound and up to date. Basically, the idea is to tell how to store the data electronically safe and secure. All three exemplifies the fact that an electronic format has its own benefit as well as barriers.

Keywords: Medical records, digital records, health information, electronic record system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1307

7275 Extreme Temperature Forecast in Mbonge, Cameroon through Return Level Analysis of the Generalized Extreme Value (GEV) Distribution

Authors: Nkongho Ayuketang Arreyndip, Ebobenow Joseph

Abstract:

In this paper, temperature extremes are forecast by employing the block maxima method of the Generalized extreme value(GEV) distribution to analyse temperature data from the Cameroon Development Corporation (C.D.C). By considering two sets of data (Raw data and simulated data) and two (stationary and non-stationary) models of the GEV distribution, return levels analysis is carried out and it was found that in the stationary model, the return values are constant over time with the raw data while in the simulated data, the return values show an increasing trend but with an upper bound. In the non-stationary model, the return levels of both the raw data and simulated data show an increasing trend but with an upper bound. This clearly shows that temperatures in the tropics even-though show a sign of increasing in the future, there is a maximum temperature at which there is no exceedence. The results of this paper are very vital in Agricultural and Environmental research.

Keywords: Return level, Generalized extreme value (GEV), Meteorology, Forecasting.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2057

7274 Mining Multicity Urban Data for Sustainable Population Relocation

Authors: Xu Du, Aparna S. Varde

Abstract:

In this research, we propose to conduct diagnostic and predictive analysis about the key factors and consequences of urban population relocation. To achieve this goal, urban simulation models extract the urban development trends as land use change patterns from a variety of data sources. The results are treated as part of urban big data with other information such as population change and economic conditions. Multiple data mining methods are deployed on this data to analyze nonlinear relationships between parameters. The result determines the driving force of population relocation with respect to urban sprawl and urban sustainability and their related parameters. This work sets the stage for developing a comprehensive urban simulation model for catering to specific questions by targeted users. It contributes towards achieving sustainability as a whole.

Keywords: Data Mining, Environmental Modeling, Sustainability, Urban Planning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1730

7273 An Ant-based Clustering System for Knowledge Discovery in DNA Chip Analysis Data

Authors: Minsoo Lee, Yun-mi Kim, Yearn Jeong Kim, Yoon-kyung Lee, Hyejung Yoon

Abstract:

Biological data has several characteristics that strongly differentiate it from typical business data. It is much more complex, usually large in size, and continuously changes. Until recently business data has been the main target for discovering trends, patterns or future expectations. However, with the recent rise in biotechnology, the powerful technology that was used for analyzing business data is now being applied to biological data. With the advanced technology at hand, the main trend in biological research is rapidly changing from structural DNA analysis to understanding cellular functions of the DNA sequences. DNA chips are now being used to perform experiments and DNA analysis processes are being used by researchers. Clustering is one of the important processes used for grouping together similar entities. There are many clustering algorithms such as hierarchical clustering, self-organizing maps, K-means clustering and so on. In this paper, we propose a clustering algorithm that imitates the ecosystem taking into account the features of biological data. We implemented the system using an Ant-Colony clustering algorithm. The system decides the number of clusters automatically. The system processes the input biological data, runs the Ant-Colony algorithm, draws the Topic Map, assigns clusters to the genes and displays the output. We tested the algorithm with a test data of 100 to1000 genes and 24 samples and show promising results for applying this algorithm to clustering DNA chip data.

Keywords: Ant colony system, biological data, clustering, DNA chip.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1932

7272 The Resource Description Framework (RDF) as a Modern Structure for Medical Data

Authors: Gabriela Lindemann, Danilo Schmidt, Thomas Schrader, Dietmar Keune

Abstract:

The amount and heterogeneity of data in biomedical research, notably in interdisciplinary fields, requires new methods for the collection, presentation and analysis of information. Important data from laboratory experiments as well as patient trials are available but come out of distributed resources. The Charité - University Hospital Berlin has established together with the German Research Foundation (DFG) a new information service centre for kidney diseases and transplantation (Open European Nephrology Science Centre - OpEN.SC). Beside a collaborative aspect to create new research groups every single partner or institution of this science information centre making his own data available is allowed to search the whole data pool of the various involved centres. A core task is the implementation of a non-restricting open data structure for the various different data sources. We decided to use a modern RDF model and in a first phase transformed original data coming from the web-based Electronic Patient Record database TBase©.

Keywords: Medical databases, Resource Description Framework (RDF), metadata repository.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1986

7271 XML Data Management in Compressed Relational Database

Authors: Hongzhi Wang, Jianzhong Li, Hong Gao

Abstract:

XML is an important standard of data exchange and representation. As a mature database system, using relational database to support XML data may bring some advantages. But storing XML in relational database has obvious redundancy that wastes disk space, bandwidth and disk I/O when querying XML data. For the efficiency of storage and query XML, it is necessary to use compressed XML data in relational database. In this paper, a compressed relational database technology supporting XML data is presented. Original relational storage structure is adaptive to XPath query process. The compression method keeps this feature. Besides traditional relational database techniques, additional query process technologies on compressed relations and for special structure for XML are presented. In this paper, technologies for XQuery process in compressed relational database are presented..

Keywords: XML, compression, query processing

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1753

7270 Privacy in New Mobile Payment Protocol

Authors: Tan Soo Fun, Leau Yu Beng, Rozaini Roslan, Habeeb Saleh Habeeb

Abstract:

The increasing development of wireless networks and the widespread popularity of handheld devices such as Personal Digital Assistants (PDAs), mobile phones and wireless tablets represents an incredible opportunity to enable mobile devices as a universal payment method, involving daily financial transactions. Unfortunately, some issues hampering the widespread acceptance of mobile payment such as accountability properties, privacy protection, limitation of wireless network and mobile device. Recently, many public-key cryptography based mobile payment protocol have been proposed. However, limited capabilities of mobile devices and wireless networks make these protocols are unsuitable for mobile network. Moreover, these protocols were designed to preserve traditional flow of payment data, which is vulnerable to attack and increase the user-s risk. In this paper, we propose a private mobile payment protocol which based on client centric model and by employing symmetric key operations. The proposed mobile payment protocol not only minimizes the computational operations and communication passes between the engaging parties, but also achieves a completely privacy protection for the payer. The future work will concentrate on improving the verification solution to support mobile user authentication and authorization for mobile payment transactions.

Keywords: Mobile Network Operator, Mobile payment protocol, Privacy, Symmetric key.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2089

7269 A System for Analyzing and Eliciting Public Grievances Using Cache Enabled Big Data

Authors: P. Kaladevi, N. Giridharan

Abstract:

The system for analyzing and eliciting public grievances serves its main purpose to receive and process all sorts of complaints from the public and respond to users. Due to the more number of complaint data becomes big data which is difficult to store and process. The proposed system uses HDFS to store the big data and uses MapReduce to process the big data. The concept of cache was applied in the system to provide immediate response and timely action using big data analytics. Cache enabled big data increases the response time of the system. The unstructured data provided by the users are efficiently handled through map reduce algorithm. The processing of complaints takes place in the order of the hierarchy of the authority. The drawbacks of the traditional database system used in the existing system are set forth by our system by using Cache enabled Hadoop Distributed File System. MapReduce framework codes have the possible to leak the sensitive data through computation process. We propose a system that add noise to the output of the reduce phase to avoid signaling the presence of sensitive data. If the complaints are not processed in the ample time, then automatically it is forwarded to the higher authority. Hence it ensures assurance in processing. A copy of the filed complaint is sent as a digitally signed PDF document to the user mail id which serves as a proof. The system report serves to be an essential data while making important decisions based on legislation.

Keywords: Big Data, Hadoop, HDFS, Caching, MapReduce, web personalization, e-governance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1548

7268 Improved K-Modes for Categorical Clustering Using Weighted Dissimilarity Measure

Authors: S.Aranganayagi, K.Thangavel

Abstract:

K-Modes is an extension of K-Means clustering algorithm, developed to cluster the categorical data, where the mean is replaced by the mode. The similarity measure proposed by Huang is the simple matching or mismatching measure. Weight of attribute values contribute much in clustering; thus in this paper we propose a new weighted dissimilarity measure for K-Modes, based on the ratio of frequency of attribute values in the cluster and in the data set. The new weighted measure is experimented with the data sets obtained from the UCI data repository. The results are compared with K-Modes and K-representative, which show that the new measure generates clusters with high purity.

Keywords: Clustering, categorical data, K-Modes, weighted dissimilarity measure

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3638

7267 Toward Delegated Democracy: Vote by Yourself, or Trust Your Network

Authors: Hiroshi Yamakawa, Michiko Yoshida, Motohiro Tsuchiya

Abstract:

The recent development of Information and Communication Technology (ICT) enables new ways of "democratic" decision-making such as a page-ranking system, which estimates the importance of a web page based on indirect trust on that page shared by diverse group of unorganized individuals. These kinds of "democracy" have not been acclaimed yet in the world of real politics. On the other hand, a large amount of data about personal relations including trust, norms of reciprocity, and networks of civic engagement has been accumulated in a computer-readable form by computer systems (e.g., social networking systems). We can use these relations as a new type of social capital to construct a new democratic decision-making system based on a delegation network. In this paper, we propose an effective decision-making support system, which is based on empowering someone's vote whom you trust. For this purpose, we propose two new techniques: the first is for estimating entire vote distribution from a small number of votes, and the second is for estimating active voter choice to promote voting using a delegation network. We show that these techniques could increase the voting ratio and credibility of the whole decision by agent-based simulations.

Keywords: Delegation, network centrality, social network, voting ratio.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1735

7266 Mobile Phone as a Tool for Data Collection in Field Research

Authors: Sandro Mourão, Karla Okada

Abstract:

The necessity of accurate and timely field data is shared among organizations engaged in fundamentally different activities, public services or commercial operations. Basically, there are three major components in the process of the qualitative research: data collection, interpretation and organization of data, and analytic process. Representative technological advancements in terms of innovation have been made in mobile devices (mobile phone, PDA-s, tablets, laptops, etc). Resources that can be potentially applied on the data collection activity for field researches in order to improve this process. This paper presents and discuss the main features of a mobile phone based solution for field data collection, composed of basically three modules: a survey editor, a server web application and a client mobile application. The data gathering process begins with the survey creation module, which enables the production of tailored questionnaires. The field workforce receives the questionnaire(s) on their mobile phones to collect the interviews responses and sending them back to a server for immediate analysis.

Keywords: Data Gathering, Field Research, Mobile Phone, Survey.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2007