Search results for: sensor data.
7296 Data-Driven Decision-Making in Digital Entrepreneurship
Authors: Abeba Nigussie Turi, Xiangming Samuel Li
Abstract:
Data-driven business models are more typical for established businesses than early-stage startups that strive to penetrate a market. This paper provides an extensive discussion of the principles of data analytics for early-stage digital entrepreneurial businesses. Here, we develop a data-driven decision-making (DDDM) framework that applies to startups prone to multifaceted barriers such as poor data access and technical and financial constraints, to name a few. The startup DDDM framework proposed in this paper is novel in its form, encompassing startup data analytics enablers and metrics that align with startups' business models, ranging from customer-centric product development to servitization, which is the future of modern digital entrepreneurship.
Keywords: Startup data analytics, data-driven decision-making, data acquisition, data generation, digital entrepreneurship.
Downloads 826
7295 Architecture Integrating Wireless Body Area Networks with Web Services for Ubiquitous Healthcare Service Provisioning
Authors: Ogunduyile O. Oluwgbenga
Abstract:
Recent advancements in sensor technologies and Wireless Body Area Networks (WBANs) have led to the development of cost-effective healthcare devices which can be used to monitor and analyse a person's physiological parameters from remote locations. These advancements provide a unique opportunity to overcome current healthcare challenges of low quality service provisioning, lack of easy accessibility to service varieties, high costs of services and the increasing population of the elderly experienced globally. This paper reports on a prototype implementation of an architecture that seamlessly integrates a Wireless Body Area Network (WBAN) with Web services (WS) to proactively collect physiological data of remote patients to recommend diagnostic services. Technologies based upon WBAN and WS can provide ubiquitous accessibility to a variety of services by allowing distributed healthcare resources to be massively reused to provide cost-effective services without individuals physically moving to the locations of those resources. In addition, these technologies can reduce costs of healthcare services by allowing individuals to access services to support their healthcare. The prototype uses WBAN body sensors implemented on Arduino Fio platforms to be worn by the patient and an Android smart phone as a personal server. The physiological data are collected and uploaded through GPRS/internet to the Medical Health Server (MHS) to be analysed. The prototype monitors the activities, location and physiological parameters such as SpO2 and heart rate of the elderly and patients in rehabilitation. Medical practitioners would have real-time access to the uploaded information through a web application.
Keywords: Android Smart phone, Arduino Fio, Web application server, Wireless Body Area Networks.
Downloads 2543
7294 Classifying Bio-Chip Data using an Ant Colony System Algorithm
Authors: Minsoo Lee, Yearn Jeong Kim, Yun-mi Kim, Sujeung Cheong, Sookyung Song
Abstract:
Bio-chips are used for experiments on genes and contain various information such as genes, samples and so on. Two-dimensional bio-chips, in which one axis represents genes and the other represents samples, are widely used these days. Instead of experimenting with real genes, which costs a lot of money and takes much time to produce results, bio-chips are used for biological experiments. Extracting data from the bio-chips with high accuracy and finding patterns or useful information in such data is very important. Bio-chip analysis systems extract data from various kinds of bio-chips and mine the data in order to get useful information. One of the commonly used methods to mine the data is classification. The algorithm used to classify the data can vary depending on the data types, number characteristics and so on. Considering that bio-chip data is extremely large, an algorithm that imitates the ecosystem, such as the ant algorithm, is suitable for classification. This paper focuses on finding classification rules from bio-chip data using the Ant Colony algorithm, which imitates the ecosystem. The developed system takes into consideration the accuracy of the discovered rules when it applies them to the bio-chip data in order to predict the classes.
Keywords: Ant Colony System, DNA chip data, Classification.
Downloads 1467
7293 Trust and Reliability for Public Sector Data
Authors: Klaus Stranacher, Vesna Krnjic, Thomas Zefferer
Abstract:
The public sector holds large amounts of data of various areas such as social affairs, economy, or tourism. Various initiatives such as Open Government Data or the EU Directive on public sector information aim to make these data available for public and private service providers. Requirements for the provision of public sector data are defined by legal and organizational frameworks. Surprisingly, the defined requirements hardly cover security aspects such as integrity or authenticity. In this paper we discuss the importance of these missing requirements and present a concept to assure the integrity and authenticity of provided data based on electronic signatures. We show that our concept is perfectly suitable for the provisioning of unaltered data. We also show that our concept can also be extended to data that needs to be anonymized before provisioning by incorporating redactable signatures. Our proposed concept enhances trust and reliability of provided public sector data.
Keywords: Trusted Public Sector Data, Integrity, Authenticity, Reliability, Redactable Signatures.
Downloads 1757
7292 A New Approach to Signal Processing for DC-Electromagnetic Flowmeters
Authors: Michael Schukat
Abstract:
Electromagnetic flowmeters with DC excitation are used for a wide range of fluid measurement tasks, but are rarely found in dosing applications with short measurement cycles due to the achievable accuracy. This paper identifies a number of factors that influence the accuracy of this sensor type when used for short-term measurements. Based on these results, a new signal-processing algorithm is described that overcomes the identified problems to some extent. In principle, this new method allows a higher accuracy of electromagnetic flowmeters with DC excitation than traditional methods.
Keywords: Electromagnetic Flowmeter, Kalman Filter, Short Measurement Cycles, Signal Estimation
Downloads 1610
7291 Analysis of Relation between Unlabeled and Labeled Data to Self-Taught Learning Performance
Authors: Ekachai Phaisangittisagul, Rapeepol Chongprachawat
Abstract:
Obtaining labeled data in supervised learning is often difficult and expensive, and thus the trained learning algorithm tends to overfit due to the small number of training data. As a result, some researchers have focused on using unlabeled data, which need not follow the same generative distribution as the labeled data, to construct a high-level feature for improving performance on supervised learning tasks. In this paper, we investigate the impact of the relationship between unlabeled and labeled data on classification performance. Specifically, we apply different unlabeled data with different degrees of relation to the labeled data for a handwritten digit classification task based on the MNIST dataset. Our experimental results show that the higher the degree of relation between unlabeled and labeled data, the better the classification performance. Although unlabeled data drawn from a completely different generative distribution than the labeled data provides the lowest classification performance, we still achieve high classification performance. This expands the applicability of supervised learning algorithms using unsupervised learning.
Keywords: Autoencoder, high-level feature, MNIST dataset, self-taught learning, supervised learning.
Downloads 1831
7290 Towards Development of Solution for Business Process-Oriented Data Analysis
Authors: M. Klimavicius
Abstract:
This paper proposes a modeling methodology for the development of a data analysis solution. The author introduces an approach to address data warehousing issues at the enterprise level. The methodology covers the requirements elicitation and analysis stage as well as the initial design of the data warehouse. The paper reviews an extended business process model that satisfies the needs of data warehouse development. The author considers the use of business process models necessary, as they reflect both enterprise information systems and business functions, which are important for data analysis. The described approach divides development into three steps with different levels of model elaboration. The described approach makes it possible to gather requirements and display them to business users in an easy manner.
Keywords: Data warehouse, data analysis, business process management.
Downloads 1390
7289 Preliminary Overview of Data Mining Technology for Knowledge Management System in Institutions of Higher Learning
Authors: Muslihah Wook, Zawiyah M. Yusof, Mohd Zakree Ahmad Nazri
Abstract:
Data mining has been integrated into application systems to enhance the quality of the decision-making process. This study focuses on the integration of data mining technology and Knowledge Management Systems (KMS), due to the ability of data mining technology to create useful knowledge from large volumes of data. Meanwhile, KMS vitally supports the creation and use of knowledge. The integration of data mining technology and KMS is popularly used in business for enhancing and sustaining organizational performance. However, there is a lack of studies that apply data mining technology and KMS in the education sector, particularly to students' academic performance, which could reflect the performance of Institutions of Higher Learning (IHLs). Realizing its importance, this study seeks to integrate data mining technology and KMS to promote effective management of knowledge within IHLs. Several concepts from the literature are adapted to propose a new integrative data mining technology and KMS framework for an IHL.
Keywords: Data mining, Institutions of Higher Learning, Knowledge Management System, Students' academic performance.
Downloads 2141
7288 Towards a Secure Storage in Cloud Computing
Authors: Mohamed Elkholy, Ahmed Elfatatry
Abstract:
Cloud computing has emerged as a flexible computing paradigm that reshaped the Information Technology map. However, cloud computing brought about a number of security challenges as a result of the physical distribution of computational resources and the limited control that users have over the physical storage. This situation raises many security challenges for data integrity and confidentiality as well as authentication and access control. This work proposes a security mechanism for data integrity that allows a data owner to be aware of any modification that takes place to his data. The data integrity mechanism is integrated with an extended Kerberos authentication that ensures authorized access control. The proposed mechanism protects data confidentiality even if data are stored on untrusted storage. The proposed mechanism has been evaluated against different types of attacks and proved efficient in protecting cloud data storage from different malicious attacks.
Keywords: Access control, data integrity, data confidentiality, Kerberos authentication, cloud security.
Downloads 1770
7287 Thailand National Biodiversity Database System with webMathematica and Google Earth
Authors: W. Katsarapong, W. Srisang, K. Jaroensutasinee, M. Jaroensutasinee
Abstract:
The National Biodiversity Database System (NBIDS) has been developed for collecting Thai biodiversity data. The goal of this project is to provide advanced tools for querying, analyzing, modeling, and visualizing patterns of species distribution for researchers and scientists. NBIDS records two types of datasets: biodiversity data and environmental data. Biodiversity data are species presence data and species status. The attributes of biodiversity data can be further classified into two groups: universal and project-specific attributes. Universal attributes are attributes that are common to all of the records, e.g., X/Y coordinates, year, and collector name. Project-specific attributes are attributes that are unique to one or a few projects, e.g., flowering stage. Environmental data include atmospheric data, hydrology data, soil data, and land cover data collected using GLOBE protocols. We have developed web-based tools for data entry. Google Earth KML and ArcGIS were used as tools for map visualization. webMathematica was used for simple data visualization and also for advanced data analysis and visualization, e.g., spatial interpolation and statistical analysis. NBIDS will be used by park rangers at Khao Nan National Park and by researchers.
Keywords: GLOBE protocol, Biodiversity, Database System, ArcGIS, Google Earth and webMathematica.
Downloads 1982
7286 Evaluation of Clustering Based on Preprocessing in Gene Expression Data
Authors: Seo Young Kim, Toshimitsu Hamasaki
Abstract:
Microarrays have become effective, broadly used tools in biological and medical research to address a wide range of problems, including classification of disease subtypes and tumors. Many statistical methods are available for analyzing and systematizing these complex data into meaningful information, and one of the main goals in analyzing gene expression data is the detection of samples or genes with similar expression patterns. In this paper, we present and compare the performance of several clustering methods based on data preprocessing, including strategies of normalization or noise clearness. We also evaluate each of these clustering methods with validation measures for both simulated data and real gene expression data. The results indicate that clustering methods commonly used in microarray data analysis are affected by normalization and by the degree of noise and clearness of the datasets.
Keywords: Gene expression, clustering, data preprocessing.
Downloads 1739
7285 Addressing Data Security in the Cloud
Authors: Marinela Mircea
Abstract:
The development of information and communication technology, the increased use of the internet, as well as the effects of the recession in recent years, have led to the increased use of cloud computing based solutions, also called on-demand solutions. These solutions offer a large number of benefits to organizations as well as challenges and risks, mainly determined by data visualization in different geographic locations on the internet. As far as the specific risks of the cloud environment are concerned, data security is still considered a peak barrier to adopting cloud computing. The present study offers an approach to ensuring the security of cloud data, oriented towards the whole data life cycle. The final part of the study focuses on the assessment of data security in the cloud, which is the basis for determining potential losses and the premise for subsequent improvements and continuous learning.
Keywords: cloud computing, data life cycle, data security, security assessment.
Downloads 2160
7284 A Network Traffic Prediction Algorithm Based On Data Mining Technique
Authors: D. Prangchumpol
Abstract:
This paper describes an approach to predicting incoming and outgoing data rates in a network system using association rule discovery, which is one of the data mining techniques. Information on incoming and outgoing data at each time and on network bandwidth are network performance parameters that are needed to solve traffic problems, since congestion and data loss are important network problems. The result of this technique can predict future network traffic. In addition, this research is useful for network routing selection and network performance improvement.
Keywords: Traffic prediction, association rule, data mining.
Downloads 3668
7283 Fuzzy Processing of Uncertain Data
Authors: Petr Morávek, Miloš Šeda
Abstract:
In practice, we often come across situations where it is necessary to make decisions based on incomplete or uncertain data. In control systems this may be due to an unknown exact mathematical model, or to its excessive complexity (e.g. nonlinearity), when it is necessary to simplify it or to solve it using a rule base. In the case of databases, when searching data we use a similarity measure to compare the requirements of the selection with the stored data, where both the select query and the data itself may contain vague terms, for example in the form of linguistic qualifiers. In this paper, we focus on the processing of uncertain data in databases and demonstrate it on the example of multi-criteria decision making in the selection of variants specified by a higher number of technical parameters.
Keywords: fuzzy logic, linguistic variable, multicriteria decision
Downloads 1416
7282 Automated Stereophotogrammetry Data Cleansing
Authors: Stuart Henry, Philip Morrow, John Winder, Bryan Scotney
Abstract:
The stereophotogrammetry modality is gaining more widespread use in the clinical setting. Registration and visualization of this data, in conjunction with conventional 3D volumetric image modalities, provides virtual human data with textured soft tissue and internal anatomical and structural information. In this investigation computed tomography (CT) and stereophotogrammetry data is acquired from 4 anatomical phantoms and registered using the trimmed iterative closest point (TrICP) algorithm. This paper fully addresses the issue of imaging artifacts around the stereophotogrammetry surface edge using the registered CT data as a reference. Several iterative algorithms are implemented to automatically identify and remove stereophotogrammetry surface edge outliers, improving the overall visualization of the combined stereophotogrammetry and CT data. This paper shows that outliers at the surface edge of stereophotogrammetry data can be successfully removed automatically.
Keywords: Data cleansing, stereophotogrammetry.
Downloads 1842
7281 Fabrication and Study of Nickel Phthalocyanine based Surface Type Capacitive Sensors
Authors: Mutabar Shah, Muhammad Hassan Sayyad, Khasan S. Karimov
Abstract:
Thin films of nickel phthalocyanine (NiPc) of different thicknesses (100, 150 and 200 nm) were deposited using a thermal evaporator on glass substrates with previously deposited aluminum electrodes to form Al/NiPc/Al surface-type capacitive humidity sensors. The capacitance-humidity relationships of the sensors were investigated at humidity levels from 35 to 90% RH. It was observed that the capacitance value increases nonlinearly with increasing humidity level. All measurements were taken at room temperature.
Keywords: Capacitive sensor, Humidity, Nickel phthalocyanine, Organic semiconductor.
Downloads 1791
7280 Effect of High-Heeled Shoes on Gait: A Micro-Electro-Mechanical-Systems Based Approach
Authors: Harun Sumbul, Orhan Ozyurt
Abstract:
The accelerations generated by shoes in the body should be known in order to prevent balance problems and degradation of body shape, and to spend less energy. This study aims to investigate the effects of shoe heel height on the human body. The study group consisted of five women (range 27-32 years) with different characteristics and five shoes with different heel heights (1, 3.5, 5, 7 and 9 cm). Individuals in the study group wore the shoes and walked along a 20-meter course. The accelerations created by the shoes were measured in three axes (30.270 accelerometric data) and analyzed. The results show that, while walking with high-heeled shoes, the foot is lifted more, so more effort is spent and more load occurs at the ankles and joints. Since high-heeled shoes cause greater acceleration, women wearing high-heeled shoes tend to pay more attention when taking a step. As a result, for foot and body health, the shoe heel must be designed to absorb the reaction from the ground. High heels disrupt the structure of the foot and damage the body shape. In this respect, this study is considered a remarkable method in the literature for finding the effect of high-heeled shoes on gait using an accelerometer.
Keywords: Acceleration, sensor, gait analysis, high shoe heel, micro-electro-mechanical-systems.
Downloads 973
7279 Local Algorithm for Establishing a Virtual Backbone in 3D Ad Hoc Network
Authors: Alaa E. Abdallah, M. Bsoul, Emad E. Abdallah, Ahmad Al-Khasawneh, Muath Alzghool
Abstract:
Due to the limited lifetime of the nodes in ad hoc and sensor networks, energy efficiency needs to be an important design consideration in any routing algorithm. It is known that by employing a virtual backbone in a wireless network, the efficiency of any routing scheme for the network can be improved. One common design for routing protocols in mobile ad hoc networks is to use positioning information; we use the nodes' geometric locations to introduce an algorithm that can construct the virtual backbone structure locally in a 3D environment. The construction runs in constant time.
Keywords: Virtual backbone, dominating set, UDG.
Downloads 1678
7278 An Improved Data Mining Method Applied to the Search of Relationship between Metabolic Syndrome and Lifestyles
Authors: Yi Chao Huang, Yu Ling Liao, Chiu Shuang Lin
Abstract:
A data cutting and sorting method (DCSM) is proposed to optimize the performance of data mining. DCSM reduces the calculation time by getting rid of redundant data during the data mining process. In addition, DCSM minimizes the computational units by splitting the database and by sorting data with support counts. In the process of searching for the relationship between metabolic syndrome and lifestyles with the health examination database of an electronics manufacturing company, DCSM demonstrates higher search efficiency than the traditional Apriori algorithm in tests with different support counts.
Keywords: Data mining, Data cutting and sorting method, Apriori algorithm, Metabolic syndrome
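For readers unfamiliar with the Apriori baseline named in this abstract, the following is a minimal sketch of level-wise frequent-itemset search with a minimum support count. It is not DCSM: the function name `apriori`, the toy transactions, and the item names are all hypothetical and chosen only for illustration.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset (frozenset): support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every single item that appears anywhere.
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([i]) for i in items]
    frequent = {}
    k = 1
    while candidates:
        # Support count = number of transactions containing the candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join step: combine frequent k-itemsets into (k+1)-itemsets.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = [c for c in joined
                      if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent

# Hypothetical toy records; not taken from the health examination database.
demo_transactions = [
    {"smoking", "low_exercise", "mets"},
    {"smoking", "mets"},
    {"low_exercise", "mets"},
    {"smoking", "low_exercise"},
]
demo_frequent = apriori(demo_transactions, min_support=2)
```

With a support count of 2, every single item and every pair in the toy data is frequent, while the three-item set is counted once and dropped. Each level rescans all transactions, which is the cost that methods such as the one in the abstract aim to reduce.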
Downloads 1587
7277 Data Mining Approach for Commercial Data Classification and Migration in Hybrid Storage Systems
Authors: Mais Haj Qasem, Maen M. Al Assaf, Ali Rodan
Abstract:
Parallel hybrid storage systems consist of a hierarchy of different storage devices that vary in terms of data reading speed. As we ascend the hierarchy, data reading speed becomes faster. Thus, migrating the application's important data that will be accessed in the near future to the uppermost level will reduce the application's I/O waiting time, hence reducing its execution elapsed time. In this research, we implement a trace-driven, two-level parallel hybrid storage system prototype that consists of HDDs and SSDs. The prototype uses data mining techniques to classify the application's data in order to determine its near-future data accesses in parallel with its on-demand requests. The important data (i.e. the data that the application will access in the near future) are continuously migrated to the uppermost level of the hierarchy. Our simulation results show that our data migration approach integrated with data mining techniques reduces the application execution elapsed time when using a variety of traces in at least to 22%.
Keywords: Data mining, hybrid storage system, recurrent neural network, support vector machine.
Downloads 1735
7276 Association Rules Mining and NOSQL Oriented Document in Big Data
Authors: Sarra Senhadji, Imene Benzeguimi, Zohra Yagoub
Abstract:
Big Data represents the recent technology of manipulating voluminous and unstructured data sets over multiple sources. NoSQL has therefore emerged to handle the problem of unstructured data. Association rule mining is one of the popular data mining techniques for extracting hidden relationships from transactional databases. The problem of finding association dependencies is well solved with Map Reduce. The goal of our work is to reduce the time needed to generate frequent itemsets by using Map Reduce and a document-oriented NoSQL database. A comparative study evaluates the performance of our algorithm against the classical Apriori algorithm.
Keywords: Apriori, Association rules mining, Big Data, data mining, Hadoop, Map Reduce, MongoDB, NoSQL.
Downloads 693
7275 Identifying Critical Success Factors for Data Quality Management through a Delphi Study
Authors: Maria Paula Santos, Ana Lucas
Abstract:
Organizations support their operations and decision making on the data they have at their disposal, so the quality of these data is remarkably important and Data Quality (DQ) is currently a relevant issue, the literature being unanimous in pointing out that poor DQ can result in large costs for organizations. The literature review identified and described 24 Critical Success Factors (CSF) for Data Quality Management (DQM) that were presented to a panel of experts, who ordered them according to their degree of importance, using the Delphi method with the Q-sort technique, based on an online questionnaire. The study shows that the five most important CSF for DQM are: definition of appropriate policies and standards, control of inputs, definition of a strategic plan for DQ, organizational culture focused on quality of the data and obtaining top management commitment and support.
Keywords: Critical success factors, data quality, data quality management, Delphi, Q-Sort.
Downloads 1106
7274 Identifying a Drug Addict Person Using Artificial Neural Networks
Authors: Mustafa Al Sukar, Azzam Sleit, Abdullatif Abu-Dalhoum, Bassam Al-Kasasbeh
Abstract:
Use and abuse of drugs by teens is very common and can have dangerous consequences. Drugs contribute to physical and sexual aggression such as assault or rape. Some teenagers regularly use drugs to compensate for depression, anxiety or a lack of positive social skills. Teens' resort to smoking should not be minimized because it can be a "gateway drug" to other drugs (marijuana, cocaine, hallucinogens, inhalants, and heroin). The combination of teenagers' curiosity, risk-taking behavior, and social pressure makes it very difficult to say no. This leads most teenagers to the question: "Will it hurt to try once?" Nowadays, technological advances are changing our lives very rapidly and adding many technologies that help us track the risk of drug abuse, such as smart phones, Wireless Sensor Networks (WSNs), Internet of Things (IoT), etc. These technologies may help in the early discovery of drug abuse in order to prevent an aggravation of the influence of drugs on the abuser. In this paper, we have developed a Decision Support System (DSS) for detecting drug abuse using an Artificial Neural Network (ANN); we used a Multilayer Perceptron (MLP) feed-forward neural network in developing the system. The input layer includes 50 variables while the output layer contains one neuron which indicates whether the person is a drug addict. An iterative process is used to determine the number of hidden layers and the number of neurons in each one. We ran multiple experimental models with the Log-Sigmoid transfer function. In particular, 10-fold cross-validation is used to assess the generalization of the proposed system. The experimental results show 98.42% classification accuracy for correct diagnosis in our system. The data were taken from 184 cases in Jordan according to a set of questions compiled from specialists, and were obtained through the families of drug abusers.
Keywords: Artificial Neural Network, Decision Support System, drug abuse, drug addiction, Multilayer Perceptron.
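The 10-fold cross-validation mentioned in the abstract can be pictured with a minimal sketch of how 184 cases might be partitioned into folds. The function name `k_fold_indices`, the fixed shuffle seed, and the way the remainder is spread across folds are assumptions for illustration; the abstract does not describe the authors' exact splitting procedure, and no model is trained here.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    indices = list(range(n_samples))
    # Shuffle with a fixed seed (an assumption) so the split is repeatable.
    random.Random(seed).shuffle(indices)
    # Spread the remainder so fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

# 184 cases as reported in the abstract: four folds of 19 and six of 18.
folds = list(k_fold_indices(184, k=10))
```

In each round a classifier would be fitted on `train_idx` and scored on `test_idx`; averaging the ten scores gives the cross-validated accuracy estimate.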
Downloads 1679
7273 IMDC: An Image-Mapped Data Clustering Technique for Large Datasets
Authors: Faruq A. Al-Omari, Nabeel I. Al-Fayoumi
Abstract:
In this paper, we present a new algorithm for clustering data in large datasets using image processing approaches. First, the dataset is mapped into a binary image plane. The synthesized image is then processed using efficient image processing techniques to cluster the data in the dataset. Hence, the algorithm avoids an exhaustive search to identify clusters. The algorithm considers only a small set of the data that contains critical boundary information sufficient to identify the contained clusters. Compared to available data clustering techniques, the proposed algorithm produces results of similar quality and outperforms them in execution time and storage requirements.
Keywords: Data clustering, Data mining, Image-mapping, Pattern discovery, Predictive analysis.
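The general idea of mapping data to a binary image and clustering on the image can be sketched as follows. This is not the IMDC algorithm itself: the abstract does not say which image processing techniques are used, so 4-connected component labeling is swapped in here as an assumption, and the function name `cluster_by_image`, the `cell` size, and the sample points are hypothetical.

```python
from collections import deque

def cluster_by_image(points, cell=1.0):
    """Map 2D points to a binary grid, then label 4-connected occupied cells."""
    # Step 1: rasterize. Each occupied grid cell acts as a set pixel.
    pixel_of = [(int(x // cell), int(y // cell)) for x, y in points]
    occupied = set(pixel_of)
    # Step 2: connected-component labeling with breadth-first search.
    label = {}
    next_label = 0
    for start in sorted(occupied):
        if start in label:
            continue
        label[start] = next_label
        queue = deque([start])
        while queue:
            cx, cy = queue.popleft()
            for nb in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                if nb in occupied and nb not in label:
                    label[nb] = next_label
                    queue.append(nb)
        next_label += 1
    # Step 3: every point inherits the label of the pixel it fell into.
    return [label[p] for p in pixel_of]

# Hypothetical data: two well-separated groups of three points each.
demo_points = [(0.2, 0.1), (1.4, 0.3), (1.1, 1.6),
               (8.0, 8.2), (8.9, 9.1), (9.5, 9.4)]
demo_labels = cluster_by_image(demo_points)
```

The work after rasterization depends on the number of occupied pixels rather than on pairwise distances between points, which illustrates why an image-based formulation can avoid an exhaustive search. The choice of `cell` controls how close two points must be to merge into one cluster.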
Downloads 1500
7272 Peakwise Smoothing of Data Models using Wavelets
Authors: D Sudheer Reddy, N Gopal Reddy, P V Radhadevi, J Saibaba, Geeta Varadan
Abstract:
Smoothing or filtering of data is the first preprocessing step for noise suppression in many applications involving data analysis. The moving average is the most popular method of smoothing data, and its generalization led to the development of the Savitzky-Golay filter. Many window smoothing methods were developed by convolving the data with different window functions for different applications; the most widely used window functions are Gaussian and Kaiser. Function approximation of the data by polynomial regression, Fourier expansion or wavelet expansion also gives smoothed data. Wavelets also smooth the data to a great extent by thresholding the wavelet coefficients. Almost all smoothing methods destroy the peaks and flatten them when the support of the window is increased. In certain applications it is desirable to retain peaks while smoothing the data as much as possible. In this paper we present a methodology called peak-wise smoothing that smooths the data to any desired level without losing the major peak features.
Keywords: smoothing, moving average, peakwise smoothing, spatial density models, planar shape models, wavelets.
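The peak-flattening behaviour described in this abstract can be seen with a plain centered moving average. The sketch below is only the baseline smoother, not the peak-wise method; the function name `moving_average`, the edge handling, and the synthetic single-spike signal are assumptions for illustration.

```python
def moving_average(signal, window):
    """Centered moving average; edges use the samples that are available."""
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed

# Synthetic signal (hypothetical): one spike of height 9 on a flat baseline.
demo_signal = [0.0] * 5 + [9.0] + [0.0] * 5
peak_w3 = max(moving_average(demo_signal, 3))  # spike averaged over 3 samples
peak_w5 = max(moving_average(demo_signal, 5))  # spike averaged over 5 samples
```

Widening the window from 3 to 5 samples lowers the surviving peak from 9/3 to 9/5 of its original height, which is the loss of peak features that the abstract sets out to avoid.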
Downloads 1749
7271 A New Precautionary Method for Measurement and Improvement the Data Quality
Authors: Seyed Mohammad Hossein Moossavizadeh, Mehran Mohsenzadeh, Nasrin Arshadi
Abstract:
Data quality is a complex and unstructured concept that concerns information systems managers. The reason for this attention is the high expense of maintaining and cleaning inefficient data. Beyond the expense of their poor quality, such data cause wrong statistics, analyses, and decisions in organizations. Therefore, managers intend to improve the quality of their information systems' data. One of the basic subjects of quality improvement is evaluating the level of quality. In this paper, we present a precautionary method whose application gives the data of information systems better quality. Our method covers different dimensions of data quality; therefore, it has the necessary integrity. The presented method has been tested on three dimensions (accuracy, value-added, and believability), and the results confirm the improvement and integrity of this method.
Keywords: Data quality, precaution, information system, measurement, improvement.
Downloads 1467
7270 An Efficient Data Mining Approach on Compressed Transactions
Authors: Jia-Yu Dai, Don-Lin Yang, Jungpin Wu, Ming-Chuan Hung
Abstract:
In an era of knowledge explosion, the volume of data grows rapidly day by day. Since data storage is a limited resource, reducing the space the data occupy becomes a challenging issue. Data compression provides a good solution that lowers the required space. Data mining has found many useful applications in recent years because it helps users discover interesting knowledge in large databases. However, existing compression algorithms are not appropriate for data mining. In [1, 2], two different approaches were proposed to compress databases and then perform the data mining process. However, they both lack the ability to decompress the data to their original state and to improve the data mining performance. In this research, a new approach called Mining Merged Transactions with the Quantification Table (M2TQT) is proposed to solve these problems. M2TQT uses the relationships among transactions to merge related transactions and builds a quantification table to prune candidate itemsets that cannot become frequent, improving the performance of mining association rules. The experiments show that M2TQT performs better than existing approaches.
Keywords: Association rule, data mining, merged transaction, quantification table.
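The abstract does not give the details of M2TQT or its quantification table, so the sketch below shows only the baseline task that such pruning aims to speed up: counting the support of candidate itemsets and keeping those that meet a minimum support. The function name, the sample transactions, and the threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(transactions, min_support, size):
    """Count every itemset of the given size and keep the frequent ones.

    This is plain brute-force support counting, not M2TQT.
    """
    counts = Counter()
    for t in transactions:
        # Sort and deduplicate so each itemset has one canonical tuple form.
        for combo in combinations(sorted(set(t)), size):
            counts[combo] += 1
    return {itemset: c for itemset, c in counts.items() if c >= min_support}

# Hypothetical transactions for illustration only.
transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "b", "d"],
    ["b", "c"],
    ["a", "c"],
]
pairs = frequent_itemsets(transactions, min_support=3, size=2)
```

Here five candidate pairs are counted and only one survives the threshold, which is the kind of wasted counting that early pruning of infrequent candidates is meant to avoid.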
Downloads 1959
7269 Development of the Gas Safety Management System using an Intelligent Gasmeter with Wireless ZigBee Network
Authors: Gyou-tae Park, Young-gyu Kim, Jeong-rock Kwon, Yongwoo Lee, Hiesik Kim
Abstract:
The proposed gas safety management system uses an intelligent gas meter to monitor gas flow and pressure, earthquakes, temperature, smoke, and methane leaks. Based on the result of an event, the system takes safety measures to protect against serious risk, communicates over a ZigBee network with a wall-pad that includes a gateway in the building, and reports the event to the user through the safety management program on a server. In addition, the inner cutoff valve of the intelligent gas meter is operated if any sensor detects an event or an abnormal condition.
Keywords: micom gas-meter, gas safety, zigbee, ubiquitous
Downloads 1946
7268 Weigh-in-Motion Data Analysis Software for Developing Traffic Data for Mechanistic Empirical Pavement Design
Authors: M. A. Hasan, M. R. Islam, R. A. Tarefder
Abstract:
Currently, there are a few user-friendly Weigh-in-Motion (WIM) data analysis software packages available that can produce traffic input data for the recently developed AASHTOWare pavement Mechanistic-Empirical (ME) design software. However, these packages have only rudimentary Quality Control (QC) processes and therefore cannot properly deal with erroneous WIM data. As pavement performance is highly sensitive to the quality of WIM data, a more refined QC process on raw WIM data is highly recommended to obtain good results. This study develops user-friendly software that produces traffic input for the ME design software. The software takes the raw data (class and weight data) collected from a WIM station and processes it with a sophisticated QC procedure. Traffic data such as traffic volume, traffic distribution, and axle load spectra can be obtained from this software and used directly in the ME design software.
Keywords: Weigh-in-motion, software, axle load spectra, traffic distribution, AASHTOWare.
Downloads 1896
7267 Human Growth Curve Estimation through a Combination of Longitudinal and Cross-sectional Data
Authors: Sedigheh Mirzaei S., Debasis Sengupta
Abstract:
Parametric models have been quite popular for studying human growth, particularly in relation to biological parameters such as peak size velocity and age at peak size velocity. Longitudinal data are generally considered vital for fitting a parametric model to individual-specific data and for studying the distribution of these biological parameters in a human population. However, cross-sectional data are easier to obtain than longitudinal data. In this paper, we present a method of combining longitudinal and cross-sectional data for the purpose of estimating the distribution of the biological parameters. We demonstrate, through simulations in the special case of the Preece-Baines model, how estimates based on longitudinal data can be improved upon by harnessing the information contained in cross-sectional data. We study the extent of improvement for different mixes of the two types of data, and finally illustrate the use of the method through data collected by the Indian Statistical Institute.
Keywords: Preece-Baines growth model, MCMC method, Mixed effect model
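For orientation, the form usually cited as Preece-Baines model 1 is reproduced below. The abstract does not state which variant of the Preece-Baines family the authors use, so treating it as model 1 is an assumption. Here $h(t)$ is size at age $t$, $h_1$ is adult size, $h_\theta$ is size at age $\theta$, and $s_0$, $s_1$ are rate constants.

```latex
h(t) = h_1 - \frac{2\,(h_1 - h_\theta)}{e^{s_0 (t - \theta)} + e^{s_1 (t - \theta)}}
```

Setting $t = \theta$ gives $h(\theta) = h_\theta$, and for positive rate constants $h(t) \to h_1$ as $t \to \infty$. Quantities such as peak size velocity and age at peak size velocity are then derived from the fitted parameters.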
Downloads 2138