Search results for: learning using labeled and unlabelled data
6880 Dynamical Analysis of Circadian Gene Expression
Authors: Carla Layana Luis Diambra
Abstract:
Microarrays technique allows the simultaneous measurements of the expression levels of thousands of mRNAs. By mining this data one can identify the dynamics of the gene expression time series. By recourse of principal component analysis, we uncover the circadian rhythmic patterns underlying the gene expression profiles from Cyanobacterium Synechocystis. We applied PCA to reduce the dimensionality of the data set. Examination of the components also provides insight into the underlying factors measured in the experiments. Our results suggest that all rhythmic content of data can be reduced to three main components.
Keywords: circadian rhythms, clustering, gene expression, PCA.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15996879 Modified Data Mining Approach for Defective Diagnosis in Hard Disk Drive Industry
Authors: S. Soommat, S. Patamatamkul, T. Prempridi, M. Sritulyachot, P. Ineure, S. Yimman
Abstract:
Currently, slider process of Hard Disk Drive Industry become more complex, defective diagnosis for yield improvement becomes more complicated and time-consumed. Manufacturing data analysis with data mining approach is widely used for solving that problem. The existing mining approach from combining of the KMean clustering, the machine oriented Kruskal-Wallis test and the multivariate chart were applied for defective diagnosis but it is still be a semiautomatic diagnosis system. This article aims to modify an algorithm to support an automatic decision for the existing approach. Based on the research framework, the new approach can do an automatic diagnosis and help engineer to find out the defective factors faster than the existing approach about 50%.Keywords: Slider process, Defective diagnosis and Data mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12026878 Authentication and Data Hiding Using a Reversible ROI-based Watermarking Scheme for DICOM Images
Authors: Osamah M. Al-Qershi, Khoo Bee Ee
Abstract:
In recent years image watermarking has become an important research area in data security, confidentiality and image integrity. Many watermarking techniques were proposed for medical images. However, medical images, unlike most of images, require extreme care when embedding additional data within them because the additional information must not affect the image quality and readability. Also the medical records, electronic or not, are linked to the medical secrecy, for that reason, the records must be confidential. To fulfill those requirements, this paper presents a lossless watermarking scheme for DICOM images. The proposed a fragile scheme combines two reversible techniques based on difference expansion for patient's data hiding and protecting the region of interest (ROI) with tamper detection and recovery capability. Patient's data are embedded into ROI, while recovery data are embedded into region of non-interest (RONI). The experimental results show that the original image can be exactly extracted from the watermarked one in case of no tampering. In case of tampered ROI, tampered area can be localized and recovered with a high quality version of the original area.Keywords: DICOM, reversible, ROI-based, watermarking.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17306877 A Comprehensive Survey on Machine Learning Techniques and User Authentication Approaches for Credit Card Fraud Detection
Authors: Niloofar Yousefi, Marie Alaghband, Ivan Garibay
Abstract:
With the increase of credit card usage, the volume of credit card misuse also has significantly increased, which may cause appreciable financial losses for both credit card holders and financial organizations issuing credit cards. As a result, financial organizations are working hard on developing and deploying credit card fraud detection methods, in order to adapt to ever-evolving, increasingly sophisticated defrauding strategies and identifying illicit transactions as quickly as possible to protect themselves and their customers. Compounding on the complex nature of such adverse strategies, credit card fraudulent activities are rare events compared to the number of legitimate transactions. Hence, the challenge to develop fraud detection that are accurate and efficient is substantially intensified and, as a consequence, credit card fraud detection has lately become a very active area of research. In this work, we provide a survey of current techniques most relevant to the problem of credit card fraud detection. We carry out our survey in two main parts. In the first part, we focus on studies utilizing classical machine learning models, which mostly employ traditional transnational features to make fraud predictions. These models typically rely on some static physical characteristics, such as what the user knows (knowledge-based method), or what he/she has access to (object-based method). In the second part of our survey, we review more advanced techniques of user authentication, which use behavioral biometrics to identify an individual based on his/her unique behavior while he/she is interacting with his/her electronic devices. These approaches rely on how people behave (instead of what they do), which cannot be easily forged. By providing an overview of current approaches and the results reported in the literature, this survey aims to drive the future research agenda for the community in order to develop more accurate, reliable and scalable models of credit card fraud detection.
Keywords: credit card fraud detection, user authentication, behavioral biometrics, machine learning, literature survey
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5606876 Stego Machine – Video Steganography using Modified LSB Algorithm
Authors: Mritha Ramalingam
Abstract:
Computer technology and the Internet have made a breakthrough in the existence of data communication. This has opened a whole new way of implementing steganography to ensure secure data transfer. Steganography is the fine art of hiding the information. Hiding the message in the carrier file enables the deniability of the existence of any message at all. This paper designs a stego machine to develop a steganographic application to hide data containing text in a computer video file and to retrieve the hidden information. This can be designed by embedding text file in a video file in such away that the video does not loose its functionality using Least Significant Bit (LSB) modification method. This method applies imperceptible modifications. This proposed method strives for high security to an eavesdropper-s inability to detect hidden information.Keywords: Data hiding, LSB, Stego machine, VideoSteganography
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 42806875 Data Projects for “Social Good”: Challenges and Opportunities
Authors: Mikel Niño, Roberto V. Zicari, Todor Ivanov, Kim Hee, Naveed Mushtaq, Marten Rosselli, Concha Sánchez-Ocaña, Karsten Tolle, José Miguel Blanco, Arantza Illarramendi, Jörg Besier, Harry Underwood
Abstract:
One of the application fields for data analysis techniques and technologies gaining momentum is the area of social good or “common good”, covering cases related to humanitarian crises, global health care, or ecology and environmental issues, among others. The promotion of data-driven projects in this field aims at increasing the efficacy and efficiency of social initiatives, improving the way these actions help humanity in general and people in need in particular. This application field, however, poses its own barriers and challenges when developing data-driven projects, lagging behind in comparison with other scenarios. These challenges derive from aspects such as the scope and scale of the social issue to solve, cultural and political barriers, the skills of main stakeholders and the technological resources available, the motivation to be engaged in such projects, or the ethical and legal issues related to sensitive data. This paper analyzes the application of data projects in the field of social good, reviewing its current state and noteworthy initiatives, and presenting a framework covering the key aspects to analyze in such projects. The goal is to provide guidelines to understand the main challenges and opportunities for this type of data project, as well as identifying the main differential issues compared to “classical” data projects in general. A case study is presented on the initial steps and stakeholder analysis of a data project for the inclusion of refugees in the city of Frankfurt, Germany, in order to empirically confront the framework with a real example.Keywords: Data-Driven projects, humanitarian operations, personal and sensitive data, social good, stakeholders analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18136874 Lean Manufacturing: Systematic Layout Planning Application to an Assembly Line Layout of a Welding Industry
Authors: Fernando Augusto Ullmann Tobe, Moacyr Amaral Domingues, Figueiredo, Stephany Rie Yamamoto Gushiken
Abstract:
The purpose of this paper is to present the process of elaborating the layout of an assembly line of a welding industry using the principles of lean manufacturing as the main driver. The objective of this paper is relevant since the current layout of the assembly line causes non-productive times for operators, being related to the lean waste of unnecessary movements. The methodology used for the project development was Project-based Learning (PBL), which is an active way of learning focused on real problems. The process of selecting the methodology for layout planning was developed considering three criteria to evaluate the most relevant one for this paper's goal. As a result of this evaluation, Systematic Layout Planning was selected, and three steps were added to it – Value Stream Mapping for the current situation and after layout changed and the definition of lean tools and layout type. This inclusion was to consider lean manufacturing in the layout redesign of the industry. The layout change resulted in an increase in the value-adding time of operations carried out in the sector, reduction in movement times between previous and final assemblies, and in cost savings regarding the man-hour value of the employees, which can be invested in productive hours instead of movement times.
Keywords: Assembly line, layout, lean manufacturing, systematic layout planning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 8406873 Multi-Enterprise Tie and Co-Operation Mechanism in Mexican Agro Industry SME's
Authors: Tania Elena González Alvarado, Ma. Antonieta Martín Granados
Abstract:
The aim of this paper is to explain what a multienterprise tie is, what evidence its analysis provides and how does the cooperation mechanism influence the establishment of a multienterprise tie. The study focuses on businesses of smaller dimension, geographically dispersed and whose businessmen are learning to cooperate in an international environment. The empirical evidence obtained at this moment permits to conclude the following: The tie is not long-lasting, it has an end; opportunism is an opportunity to learn; the multi-enterprise tie is a space to learn about the cooperation mechanism; the local tie permits a businessman to alternate between competition and cooperation strategies; the disappearance of a tie is an experience of learning for a businessman, diminishing the possibility of failure in the next tie; the cooperation mechanism tends to eliminate hierarchical relations; the multienterprise tie diminishes the asymmetries and permits SME-s to have a better position when they negotiate with large companies; the multi-enterprise tie impacts positively on the local system. The collection of empirical evidence was done trough the following instruments: direct observation in a business encounter to which the businesses attended in 2003 (202 Mexican agro industry SME-s), a survey applied in 2004 (129), a questionnaire applied in 2005 (86 businesses), field visits to the businesses during the period 2006-2008 and; a survey applied by telephone in 2008 (55 Mexican agro industry SME-s).
Keywords: Cooperation, multi-enterprise tie, links, networks.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12796872 Auto Classification for Search Intelligence
Authors: Lilac A. E. Al-Safadi
Abstract:
This paper proposes an auto-classification algorithm of Web pages using Data mining techniques. We consider the problem of discovering association rules between terms in a set of Web pages belonging to a category in a search engine database, and present an auto-classification algorithm for solving this problem that are fundamentally based on Apriori algorithm. The proposed technique has two phases. The first phase is a training phase where human experts determines the categories of different Web pages, and the supervised Data mining algorithm will combine these categories with appropriate weighted index terms according to the highest supported rules among the most frequent words. The second phase is the categorization phase where a web crawler will crawl through the World Wide Web to build a database categorized according to the result of the data mining approach. This database contains URLs and their categories.Keywords: Information Processing on the Web, Data Mining, Document Classification.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16236871 Effects of Gamification on Lower Secondary School Students’ Motivation and Engagement
Authors: Goh Yung Hong, Mona Masood
Abstract:
This paper explores the effects of gamification on lower secondary school students’ motivation and engagement in the classroom. Two-group posttest-only experimental design were employed to study the influence of gamification teaching method (GTM) when compared with conventional teaching method (CTM) on 60 lower secondary school students. The Student Engagement Instrument (SEI) and Intrinsic Motivation Inventory (IMI) were used to assess students’ intrinsic motivation and engagement level towards the respective teaching method. Finding indicates that students who completed the GTM lesson were significantly higher in intrinsic motivation to learn than those from the CTM. Although the result were insignificant and only marginal difference in the engagement mean, GTM still show better potential in raising student’s engagement in class when compared with CTM. This finding proves that the GTM is likely to solve the current issue of low motivation to learn and low engagement in class among lower secondary school students in Malaysia. On the other hand, despite being not significant, higher mean indicates that CTM positively contribute to higher peer support for learning and better teacher and student relationship when compared with GTM. As a conclusion, gamification approach is flexible and can be adapted into many learning content to enhance the intrinsic motivation to learn and to some extent, encourage better student engagement in class.
Keywords: Conventional teaching method, Gamification teaching method, Motivation, Engagement.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 58306870 Retrieval of Relevant Visual Data in Selected Machine Vision Tasks: Examples of Hardware-based and Software-based Solutions
Authors: Andrzej Śluzek
Abstract:
To illustrate diversity of methods used to extract relevant (where the concept of relevance can be differently defined for different applications) visual data, the paper discusses three groups of such methods. They have been selected from a range of alternatives to highlight how hardware and software tools can be complementarily used in order to achieve various functionalities in case of different specifications of “relevant data". First, principles of gated imaging are presented (where relevance is determined by the range). The second methodology is intended for intelligent intrusion detection, while the last one is used for content-based image matching and retrieval. All methods have been developed within projects supervised by the author.
Keywords: Relevant visual data, gated imaging, intrusion detection, image matching.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13986869 SOA-Based Mobile Application for Crime Control in Thailand
Authors: Jintana Khemprasit, Vatcharaporn Esichaikul
Abstract:
Crime is a major societal problem for most of the world's nations. Consequently, the police need to develop new methods to improve their efficiency in dealing with these ever increasing crime rates. Two of the common difficulties that the police face in crime control are crime investigation and the provision of crime information to the general public to help them protect themselves. Crime control in police operations involves the use of spatial data, crime data and the related crime data from different organizations (depending on the nature of the analysis to be made). These types of data are collected from several heterogeneous sources in different formats and from different platforms, resulting in a lack of standardization. Moreover, there is no standard framework for crime data collection, integration and dissemination through mobile devices. An investigation into the current situation in crime control was carried out to identify the needs to resolve these issues. This paper proposes and investigates the use of service oriented architecture (SOA) and the mobile spatial information service in crime control. SOA plays an important role in crime control as an appropriate way to support data exchange and model sharing from heterogeneous sources. Crime control also needs to facilitate mobile spatial information services in order to exchange, receive, share and release information based on location to mobile users anytime and anywhere.Keywords: Crime Control, Geographic Information System (GIS), Mobile GIS, Service Oriented Architecture (SOA).
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 25416868 Satisfaction of Distance Education University Students with the Use of Audio Media as a Medium of Instruction: The Case of Mountains of the Moon University in Uganda
Authors: Mark Kaahwa, Chang Zhu, Moses Muhumuza
Abstract:
This study investigates the satisfaction of distance education university students (DEUS) with the use of audio media as a medium of instruction. Studying students’ satisfaction is vital because it shows whether learners are comfortable with a certain instructional strategy or not. Although previous studies have investigated the use of audio media, the satisfaction of students with an instructional strategy that combines radio teaching and podcasts as an independent teaching strategy has not been fully investigated. In this study, all lectures were delivered through the radio and students had no direct contact with their instructors. No modules or any other material in form of text were given to the students. They instead, revised the taught content by listening to podcasts saved on their mobile electronic gadgets. Prior to data collection, DEUS received orientation through workshops on how to use audio media in distance education. To achieve objectives of the study, a survey, naturalistic observations and face-to-face interviews were used to collect data from a sample of 211 undergraduate and graduate students. Findings indicate that there was no statistically significant difference in the levels of satisfaction between male and female students. The results from post hoc analysis show that there is a statistically significant difference in the levels of satisfaction regarding the use of audio media between diploma and graduate students. Diploma students are more satisfied compared to their graduate counterparts. T-test results reveal that there was no statistically significant difference in the general satisfaction with audio media between rural and urban-based students. And ANOVA results indicate that there is no statistically significant difference in the levels of satisfaction with the use of audio media across age groups. Furthermore, results from observations and interviews reveal that DEUS found learning using audio media a pleasurable medium of instruction. This is an indication that audio media can be considered as an instructional strategy on its own merit.
Keywords: Audio media, distance education, distance education university students, medium of instruction, satisfaction.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 8106867 Multidimensional and Data Mining Analysis for Property Investment Risk Analysis
Authors: Nur Atiqah Rochin Demong, Jie Lu, Farookh Khadeer Hussain
Abstract:
Property investment in the real estate industry has a high risk due to the uncertainty factors that will affect the decisions made and high cost. Analytic hierarchy process has existed for some time in which referred to an expert-s opinion to measure the uncertainty of the risk factors for the risk analysis. Therefore, different level of experts- experiences will create different opinion and lead to the conflict among the experts in the field. The objective of this paper is to propose a new technique to measure the uncertainty of the risk factors based on multidimensional data model and data mining techniques as deterministic approach. The propose technique consist of a basic framework which includes four modules: user, technology, end-user access tools and applications. The property investment risk analysis defines as a micro level analysis as the features of the property will be considered in the analysis in this paper.Keywords: Uncertainty factors, data mining, multidimensional data model, risk analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 29316866 Computational Aspects of Regression Analysis of Interval Data
Authors: Michal Cerny
Abstract:
We consider linear regression models where both input data (the values of independent variables) and output data (the observations of the dependent variable) are interval-censored. We introduce a possibilistic generalization of the least squares estimator, so called OLS-set for the interval model. This set captures the impact of the loss of information on the OLS estimator caused by interval censoring and provides a tool for quantification of this effect. We study complexity-theoretic properties of the OLS-set. We also deal with restricted versions of the general interval linear regression model, in particular the crisp input – interval output model. We give an argument that natural descriptions of the OLS-set in the crisp input – interval output cannot be computed in polynomial time. Then we derive easily computable approximations for the OLS-set which can be used instead of the exact description. We illustrate the approach by an example.
Keywords: Linear regression, interval-censored data, computational complexity.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14786865 Time-Derivative Estimation of Noisy Movie Data using Adaptive Control Theory
Authors: Soon-Hyun Park, Takami Matsuo
Abstract:
This paper presents an adaptive differentiator of sequential data based on the adaptive control theory. The algorithm is applied to detect moving objects by estimating a temporal gradient of sequential data at a specified pixel. We adopt two nonlinear intensity functions to reduce the influence of noises. The derivatives of the nonlinear intensity functions are estimated by an adaptive observer with σ-modification update law.Keywords: Adaptive estimation, parameter adjustmentlaw, motion detection, temporal gradient, differential filter.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18846864 Transformation of the Business Model in an Occupational Health Care Company Embedded in an Emerging Personal Data Ecosystem: A Case Study in Finland
Authors: Tero Huhtala, Minna Pikkarainen, Saila Saraniemi
Abstract:
Information technology has long been used as an enabler of exchange for goods and services. Services are evolving from generic to personalized, and the reverse use of customer data has been discussed in both academia and industry for the past few years. This article presents the results of an empirical case study in the area of preventive health care services. The primary data were gathered in workshops, in which future personal data-based services were conceptualized by analyzing future scenarios from a business perspective. The aim of this study is to understand business model transformation in emerging personal data ecosystems. The work was done as a case study in the context of occupational healthcare. The results have implications to theory and practice, indicating that adopting personal data management principles requires transformation of the business model, which, if successfully managed, may provide access to more resources, potential to offer better value, and additional customer channels. These advantages correlate with the broadening of the business ecosystem. Expanding the scope of this study to include more actors would improve the validity of the research. The results draw from existing literature and are based on findings from a case study and the economic properties of the healthcare industry in Finland.
Keywords: Ecosystem, business model, personal data, preventive healthcare.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 11546863 Beam and Diffuse Solar Energy in Zarqa City
Authors: Ali M. Jawarneh
Abstract:
Beam and diffuse radiation data are extracted analytically from previous measured data on a horizontal surface in Zarqa city. Moreover, radiation data on a tilted surfaces with different slopes have been derived and analyzed. These data are consisting of of beam contribution, diffuse contribution, and ground reflected contribution radiation. Hourly radiation data for horizontal surface possess the highest radiation values on June, and then the values decay as the slope increases and the sharp decreasing happened for vertical surface. The beam radiation on a horizontal surface owns the highest values comparing to diffuse radiation for all days of June. The total daily radiation on the tilted surface decreases with slopes. The beam radiation data also decays with slopes especially for vertical surface. Diffuse radiation slightly decreases with slopes with sharp decreases for vertical surface. The groundreflected radiation grows with slopes especially for vertical surface. It-s clear that in June the highest harvesting of solar energy occurred for horizontal surface, then the harvesting decreases as the slope increases.
Keywords: Beam and Diffuse Radiation, Zarqa City
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15626862 An Application of the Data Mining Methods with Decision Rule
Authors: Xun Ge, Jianhua Gong
Abstract:
ankings for output of Chinese main agricultural commodity in the world for 1978, 1980, 1990, 2000, 2006, 2007 and 2008 have been released in United Nations FAO Database. Unfortunately, where the ranking of output of Chinese cotton lint in the world for 2008 was missed. This paper uses sequential data mining methods with decision rules filling this gap. This new data mining method will be help to give a further improvement for United Nations FAO Database.
Keywords: Ranking, output of the main agricultural commodity, gross domestic product, decision table, information system, data mining, decision rule
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17146861 LiDAR Based Real Time Multiple Vehicle Detection and Tracking
Authors: Zhongzhen Luo, Saeid Habibi, Martin v. Mohrenschildt
Abstract:
Self-driving vehicle require a high level of situational awareness in order to maneuver safely when driving in real world condition. This paper presents a LiDAR based real time perception system that is able to process sensor raw data for multiple target detection and tracking in dynamic environment. The proposed algorithm is nonparametric and deterministic that is no assumptions and priori knowledge are needed from the input data and no initializations are required. Additionally, the proposed method is working on the three-dimensional data directly generated by LiDAR while not scarifying the rich information contained in the domain of 3D. Moreover, a fast and efficient for real time clustering algorithm is applied based on a radially bounded nearest neighbor (RBNN). Hungarian algorithm procedure and adaptive Kalman filtering are used for data association and tracking algorithm. The proposed algorithm is able to run in real time with average run time of 70ms per frame.Keywords: LiDAR, real-time system, clustering, tracking, data association.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 46876860 Watermark Bit Rate in Diverse Signal Domains
Authors: Nedeljko Cvejic, Tapio Sepp
Abstract:
A study of the obtainable watermark data rate for information hiding algorithms is presented in this paper. As the perceptual entropy for wideband monophonic audio signals is in the range of four to five bits per sample, a significant amount of additional information can be inserted into signal without causing any perceptual distortion. Experimental results showed that transform domain watermark embedding outperforms considerably watermark embedding in time domain and that signal decompositions with a high gain of transform coding, like the wavelet transform, are the most suitable for high data rate information hiding. Keywords?Digital watermarking, information hiding, audio watermarking, watermark data rate.
Keywords: Digital watermarking, information hiding, audio watermarking, watermark data rate.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16316859 Concurrent Access to Complex Entities
Authors: Cosmin Rablou
Abstract:
In this paper we present a way of controlling the concurrent access to data in a distributed application using the Pessimistic Offline Lock design pattern. In our case, the application processes a complex entity, which contains in a hierarchical structure different other entities (objects). It will be shown how the complex entity and the contained entities must be locked in order to control the concurrent access to data.Keywords: Object-oriented programming, Pessimistic Lock, Design pattern, Concurrent access to data, Processing complex entities
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13196858 A Remote Sensing Approach to Calculate Population Using Roads Network Data in Lebanon
Authors: Kamel Allaw, Jocelyne Adjizian Gerard, Makram Chehayeb, Nada Badaro Saliba
Abstract:
In developing countries, such as Lebanon, the demographic data are hardly available due to the absence of the mechanization of population system. The aim of this study is to evaluate, using only remote sensing data, the correlations between the number of population and the characteristics of roads network (length of primary roads, length of secondary roads, total length of roads, density and percentage of roads and the number of intersections). In order to find the influence of the different factors on the demographic data, we studied the degree of correlation between each factor and the number of population. The results of this study have shown a strong correlation between the number of population and the density of roads and the number of intersections.
Keywords: Population, road network, statistical correlations, remote sensing.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 10036857 Risk-Management by Numerical Pattern Analysis in Data-Mining
Authors: M. Kargar, R. Mirmiran, F. Fartash, T. Saderi
Abstract:
In this paper a new method is suggested for risk management by the numerical patterns in data-mining. These patterns are designed using probability rules in decision trees and are cared to be valid, novel, useful and understandable. Considering a set of functions, the system reaches to a good pattern or better objectives. The patterns are analyzed through the produced matrices and some results are pointed out. By using the suggested method the direction of the functionality route in the systems can be controlled and best planning for special objectives be done.Keywords: Analysis, Data-mining, Pattern, Risk Management.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12786856 Wind Speed Data Analysis using Wavelet Transform
Authors: S. Avdakovic, A. Lukac, A. Nuhanovic, M. Music
Abstract:
Renewable energy systems are becoming a topic of great interest and investment in the world. In recent years wind power generation has experienced a very fast development in the whole world. For planning and successful implementations of good wind power plant projects, wind potential measurements are required. In these projects, of great importance is the effective choice of the micro location for wind potential measurements, installation of the measurement station with the appropriate measuring equipment, its maintenance and analysis of the gained data on wind potential characteristics. In this paper, a wavelet transform has been applied to analyze the wind speed data in the context of insight in the characteristics of the wind and the selection of suitable locations that could be the subject of a wind farm construction. This approach shows that it can be a useful tool in investigation of wind potential.Keywords: Wind potential, Wind speed data, Wavelettransform.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 26376855 A Biometric Template Security Approach to Fingerprints Based on Polynomial Transformations
Authors: Ramon Santana
Abstract:
The use of biometric identifiers in the field of information security, access control to resources, authentication in ATMs and banking among others, are of great concern because of the safety of biometric data. In the general architecture of a biometric system have been detected eight vulnerabilities, six of them allow obtaining minutiae template in plain text. The main consequence of obtaining minutia templates is the loss of biometric identifier for life. To mitigate these vulnerabilities several models to protect minutiae templates have been proposed. Several vulnerabilities in the cryptographic security of these models allow to obtain biometric data in plain text. In order to increase the cryptographic security and ease of reversibility, a minutiae templates protection model is proposed. The model aims to make the cryptographic protection and facilitate the reversibility of data using two levels of security. The first level of security is the data transformation level. In this level generates invariant data to rotation and translation, further transformation is irreversible. The second level of security is the evaluation level, where the encryption key is generated and data is evaluated using a defined evaluation function. The model is aimed at mitigating known vulnerabilities of the proposed models, basing its security on the impossibility of the polynomial reconstruction.Keywords: Fingerprint, template protection, bio-cryptography, minutiae protection.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 8526854 Complex-Valued Neural Network in Signal Processing: A Study on the Effectiveness of Complex Valued Generalized Mean Neuron Model
Authors: Anupama Pande, Ashok Kumar Thakur, Swapnoneel Roy
Abstract:
A complex valued neural network is a neural network which consists of complex valued input and/or weights and/or thresholds and/or activation functions. Complex-valued neural networks have been widening the scope of applications not only in electronics and informatics, but also in social systems. One of the most important applications of the complex valued neural network is in signal processing. In Neural networks, generalized mean neuron model (GMN) is often discussed and studied. The GMN includes a new aggregation function based on the concept of generalized mean of all the inputs to the neuron. This paper aims to present exhaustive results of using Generalized Mean Neuron model in a complex-valued neural network model that uses the back-propagation algorithm (called -Complex-BP-) for learning. Our experiments results demonstrate the effectiveness of a Generalized Mean Neuron Model in a complex plane for signal processing over a real valued neural network. We have studied and stated various observations like effect of learning rates, ranges of the initial weights randomly selected, error functions used and number of iterations for the convergence of error required on a Generalized Mean neural network model. Some inherent properties of this complex back propagation algorithm are also studied and discussed.Keywords: Complex valued neural network, Generalized Meanneuron model, Signal processing.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17366853 Index t-SNE: Tracking Dynamics of High-Dimensional Datasets with Coherent Embeddings
Authors: G. Candel, D. Naccache
Abstract:
t-SNE is an embedding method that the data science community has widely used. It helps two main tasks: to display results by coloring items according to the item class or feature value; and for forensic, giving a first overview of the dataset distribution. Two interesting characteristics of t-SNE are the structure preservation property and the answer to the crowding problem, where all neighbors in high dimensional space cannot be represented correctly in low dimensional space. t-SNE preserves the local neighborhood, and similar items are nicely spaced by adjusting to the local density. These two characteristics produce a meaningful representation, where the cluster area is proportional to its size in number, and relationships between clusters are materialized by closeness on the embedding. This algorithm is non-parametric. The transformation from a high to low dimensional space is described but not learned. Two initializations of the algorithm would lead to two different embedding. In a forensic approach, analysts would like to compare two or more datasets using their embedding. A naive approach would be to embed all datasets together. However, this process is costly as the complexity of t-SNE is quadratic, and would be infeasible for too many datasets. Another approach would be to learn a parametric model over an embedding built with a subset of data. While this approach is highly scalable, points could be mapped at the same exact position, making them indistinguishable. This type of model would be unable to adapt to new outliers nor concept drift. This paper presents a methodology to reuse an embedding to create a new one, where cluster positions are preserved. The optimization process minimizes two costs, one relative to the embedding shape and the second relative to the support embedding’ match. The embedding with the support process can be repeated more than once, with the newly obtained embedding. The successive embedding can be used to study the impact of one variable over the dataset distribution or monitor changes over time. This method has the same complexity as t-SNE per embedding, and memory requirements are only doubled. For a dataset of n elements sorted and split into k subsets, the total embedding complexity would be reduced from O(n2) to O(n2/k), and the memory requirement from n2 to 2(n/k)2 which enables computation on recent laptops. The method showed promising results on a real-world dataset, allowing to observe the birth, evolution and death of clusters. The proposed approach facilitates identifying significant trends and changes, which empowers the monitoring high dimensional datasets’ dynamics.
Keywords: Concept drift, data visualization, dimension reduction, embedding, monitoring, reusability, t-SNE, unsupervised learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4946852 SIMGraph: Simplifying Contig Graph to Improve de Novo Genome Assembly Using Next-generation Sequencing Data
Authors: Chien-Ju Li, Chun-Hui Yu, Chi-Chuan Hwang, Tsunglin Liu , Darby Tien-Hao Chang
Abstract:
De novo genome assembly is always fragmented. Assembly fragmentation is more serious using the popular next generation sequencing (NGS) data because NGS sequences are shorter than the traditional Sanger sequences. As the data throughput of NGS is high, the fragmentations in assemblies are usually not the result of missing data. On the contrary, the assembled sequences, called contigs, are often connected to more than one other contigs in a complicated manner, leading to the fragmentations. False connections in such complicated connections between contigs, named a contig graph, are inevitable because of repeats and sequencing/assembly errors. Simplifying a contig graph by removing false connections directly improves genome assembly. In this work, we have developed a tool, SIMGraph, to resolve ambiguous connections between contigs using NGS data. Applying SIMGraph to the assembly of a fungus and a fish genome, we resolved 27.6% and 60.3% ambiguous contig connections, respectively. These results can reduce the experimental efforts in resolving contig connections.
Keywords: Contig graph, NGS, de novo assembly, scaffold.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17436851 Acute Coronary Syndrome Prediction Using Data Mining Techniques- An Application
Authors: Tahseen A. Jilani, Huda Yasin, Madiha Yasin, C. Ardil
Abstract:
In this paper we use data mining techniques to investigate factors that contribute significantly to enhancing the risk of acute coronary syndrome. We assume that the dependent variable is diagnosis – with dichotomous values showing presence or absence of disease. We have applied binary regression to the factors affecting the dependent variable. The data set has been taken from two different cardiac hospitals of Karachi, Pakistan. We have total sixteen variables out of which one is assumed dependent and other 15 are independent variables. For better performance of the regression model in predicting acute coronary syndrome, data reduction techniques like principle component analysis is applied. Based on results of data reduction, we have considered only 14 out of sixteen factors.
Keywords: Acute coronary syndrome (ACS), binary logistic regression analyses, myocardial ischemia (MI), principle component analysis, unstable angina (U.A.).
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2120