Search results for: pseudo-panel data method
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 36982

Search results for: pseudo-panel data method

36742 A Comparative Study of the Athlete Health Records' Minimum Data Set in Selected Countries and Presenting a Model for Iran

Authors: Robab Abdolkhani, Farzin Halabchi, Reza Safdari, Goli Arji

Abstract:

Background and purpose: The quality of health record depends on the quality of its content and proper documentation. Minimum data set makes a standard method for collecting key data elements that make them easy to understand and enable comparison. The aim of this study was to determine the minimum data set for Iranian athletes’ health records. Methods: This study is an applied research of a descriptive comparative type which was carried out in 2013. By using internal and external forms of documentation, a checklist was created that included data elements of athletes health record and was subjected to debate in Delphi method by experts in the field of sports medicine and health information management. Results: From 97 elements which were subjected to discussion, 85 elements by more than 75 percent of the participants (as the main elements) and 12 elements by 50 to 75 percent of the participants (as the proposed elements) were agreed upon. In about 97 elements of the case, there was no significant difference between responses of alumni groups of sport pathology and sports medicine specialists with medical record, medical informatics and information management professionals. Conclusion: Minimum data set of Iranian athletes’ health record with four information categories including demographic information, health history, assessment and treatment plan was presented. The proposed model is available for manual and electronic medical records.

Keywords: Documentation, Health record, Minimum data set, Sports medicine

Procedia PDF Downloads 444
36741 A Proposed Framework for Software Redocumentation Using Distributed Data Processing Techniques and Ontology

Authors: Laila Khaled Almawaldi, Hiew Khai Hang, Sugumaran A. l. Nallusamy

Abstract:

Legacy systems are crucial for organizations, but their intricacy and lack of documentation pose challenges for maintenance and enhancement. Redocumentation of legacy systems is vital for automatically or semi-automatically creating documentation for software lacking sufficient records. It aims to enhance system understandability, maintainability, and knowledge transfer. However, existing redocumentation methods need improvement in data processing performance and document generation efficiency. This stems from the necessity to efficiently handle the extensive and complex code of legacy systems. This paper proposes a method for semi-automatic legacy system re-documentation using semantic parallel processing and ontology. Leveraging parallel processing and ontology addresses current challenges by distributing the workload and creating documentation with logically interconnected data. The paper outlines challenges in legacy system redocumentation and suggests a method of redocumentation using parallel processing and ontology for improved efficiency and effectiveness.

Keywords: legacy systems, redocumentation, big data analysis, parallel processing

Procedia PDF Downloads 11
36740 Development of New Technology Evaluation Model by Using Patent Information and Customers' Review Data

Authors: Kisik Song, Kyuwoong Kim, Sungjoo Lee

Abstract:

Many global firms and corporations derive new technology and opportunity by identifying vacant technology from patent analysis. However, previous studies failed to focus on technologies that promised continuous growth in industrial fields. Most studies that derive new technology opportunities do not test practical effectiveness. Since previous studies depended on expert judgment, it became costly and time-consuming to evaluate new technologies based on patent analysis. Therefore, research suggests a quantitative and systematic approach to technology evaluation indicators by using patent data to and from customer communities. The first step involves collecting two types of data. The data is used to construct evaluation indicators and apply these indicators to the evaluation of new technologies. This type of data mining allows a new method of technology evaluation and better predictor of how new technologies are adopted.

Keywords: data mining, evaluating new technology, technology opportunity, patent analysis

Procedia PDF Downloads 345
36739 Bayesian Analysis of Topp-Leone Generalized Exponential Distribution

Authors: Najrullah Khan, Athar Ali Khan

Abstract:

The Topp-Leone distribution was introduced by Topp- Leone in 1955. In this paper, an attempt has been made to fit Topp-Leone Generalized exponential (TPGE) distribution. A real survival data set is used for illustrations. Implementation is done using R and JAGS and appropriate illustrations are made. R and JAGS codes have been provided to implement censoring mechanism using both optimization and simulation tools. The main aim of this paper is to describe and illustrate the Bayesian modelling approach to the analysis of survival data. Emphasis is placed on the modeling of data and the interpretation of the results. Crucial to this is an understanding of the nature of the incomplete or 'censored' data encountered. Analytic approximation and simulation tools are covered here, but most of the emphasis is on Markov chain based Monte Carlo method including independent Metropolis algorithm, which is currently the most popular technique. For analytic approximation, among various optimization algorithms and trust region method is found to be the best. In this paper, TPGE model is also used to analyze the lifetime data in Bayesian paradigm. Results are evaluated from the above mentioned real survival data set. The analytic approximation and simulation methods are implemented using some software packages. It is clear from our findings that simulation tools provide better results as compared to those obtained by asymptotic approximation.

Keywords: Bayesian Inference, JAGS, Laplace Approximation, LaplacesDemon, posterior, R Software, simulation

Procedia PDF Downloads 503
36738 Fuzzy Gauge Capability (Cg and Cgk) through Buckley Approach

Authors: Seyed Habib A. Rahmati, Mohsen Sadegh Amalnick

Abstract:

Different terms of the statistical process control (SPC) has sketch in the fuzzy environment. However, measurement system analysis (MSA), as a main branch of the SPC, is rarely investigated in fuzzy area. This procedure assesses the suitability of the data to be used in later stages or decisions of the SPC. Therefore, this research focuses on some important measures of MSA and through a new method introduces the measures in fuzzy environment. In this method, which works based on Buckley approach, imprecision and vagueness nature of the real world measurement are considered simultaneously. To do so, fuzzy version of the gauge capability (Cg and Cgk) are introduced. The method is also explained through example clearly.

Keywords: measurement, SPC, MSA, gauge capability (Cg and Cgk)

Procedia PDF Downloads 612
36737 Operating Speed Models on Tangent Sections of Two-Lane Rural Roads

Authors: Dražen Cvitanić, Biljana Maljković

Abstract:

This paper presents models for predicting operating speeds on tangent sections of two-lane rural roads developed on continuous speed data. The data corresponds to 20 drivers of different ages and driving experiences, driving their own cars along an 18 km long section of a state road. The data were first used for determination of maximum operating speeds on tangents and their comparison with speeds in the middle of tangents i.e. speed data used in most of operating speed studies. Analysis of continuous speed data indicated that the spot speed data are not reliable indicators of relevant speeds. After that, operating speed models for tangent sections were developed. There was no significant difference between models developed using speed data in the middle of tangent sections and models developed using maximum operating speeds on tangent sections. All developed models have higher coefficient of determination then models developed on spot speed data. Thus, it can be concluded that the method of measuring has more significant impact on the quality of operating speed model than the location of measurement.

Keywords: operating speed, continuous speed data, tangent sections, spot speed, consistency

Procedia PDF Downloads 424
36736 Structural Health Monitoring of Buildings–Recorded Data and Wave Method

Authors: Tzong-Ying Hao, Mohammad T. Rahmani

Abstract:

This article presents the structural health monitoring (SHM) method based on changes in wave traveling times (wave method) within a layered 1-D shear beam model of structure. The wave method measures the velocity of shear wave propagating in a building from the impulse response functions (IRF) obtained from recorded data at different locations inside the building. If structural damage occurs in a structure, the velocity of wave propagation through it changes. The wave method analysis is performed on the responses of Torre Central building, a 9-story shear wall structure located in Santiago, Chile. Because events of different intensity (ambient vibrations, weak and strong earthquake motions) have been recorded at this building, therefore it can serve as a full-scale benchmark to validate the structural health monitoring method utilized. The analysis of inter-story drifts and the Fourier spectra for the EW and NS motions during 2010 Chile earthquake are presented. The results for the NS motions suggest the coupling of translation and torsion responses. The system frequencies (estimated from the relative displacement response of the 8th-floor with respect to the basement from recorded data) were detected initially decreasing approximately 24% in the EW motion. Near the end of shaking, an increase of about 17% was detected. These analysis and results serve as baseline indicators of the occurrence of structural damage. The detected changes in wave velocities of the shear beam model are consistent with the observed damage. However, the 1-D shear beam model is not sufficient to simulate the coupling of translation and torsion responses in the NS motion. The wave method is proven for actual implementation in structural health monitoring systems based on carefully assessing the resolution and accuracy of the model for its effectiveness on post-earthquake damage detection in buildings.

Keywords: Chile earthquake, damage detection, earthquake response, impulse response function, shear beam model, shear wave velocity, structural health monitoring, torre central building, wave method

Procedia PDF Downloads 346
36735 Analysis and Rule Extraction of Coronary Artery Disease Data Using Data Mining

Authors: Rezaei Hachesu Peyman, Oliyaee Azadeh, Salahzadeh Zahra, Alizadeh Somayyeh, Safaei Naser

Abstract:

Coronary Artery Disease (CAD) is one major cause of disability in adults and one main cause of death in developed. In this study, data mining techniques including Decision Trees, Artificial neural networks (ANNs), and Support Vector Machine (SVM) analyze CAD data. Data of 4948 patients who had suffered from heart diseases were included in the analysis. CAD is the target variable, and 24 inputs or predictor variables are used for the classification. The performance of these techniques is compared in terms of sensitivity, specificity, and accuracy. The most significant factor influencing CAD is chest pain. Elderly males (age > 53) have a high probability to be diagnosed with CAD. SVM algorithm is the most useful way for evaluation and prediction of CAD patients as compared to non-CAD ones. Application of data mining techniques in analyzing coronary artery diseases is a good method for investigating the existing relationships between variables.

Keywords: classification, coronary artery disease, data-mining, knowledge discovery, extract

Procedia PDF Downloads 632
36734 A New Computational Package for Using in CFD and Other Problems (Third Edition)

Authors: Mohammad Reza Akhavan Khaleghi

Abstract:

This paper shows changes done to the Reduced Finite Element Method (RFEM) that its result will be the most powerful numerical method that has been proposed so far (some forms of this method are so powerful that they can approximate the most complex equations simply Laplace equation!). Finite Element Method (FEM) is a powerful numerical method that has been used successfully for the solution of the existing problems in various scientific and engineering fields such as its application in CFD. Many algorithms have been expressed based on FEM, but none have been used in popular CFD software. In this section, full monopoly is according to Finite Volume Method (FVM) due to better efficiency and adaptability with the physics of problems in comparison with FEM. It doesn't seem that FEM could compete with FVM unless it was fundamentally changed. This paper shows those changes and its result will be a powerful method that has much better performance in all subjects in comparison with FVM and another computational method. This method is not to compete with the finite volume method but to replace it.

Keywords: reduced finite element method, new computational package, new finite element formulation, new higher-order form, new isogeometric analysis

Procedia PDF Downloads 82
36733 Analysis of Expression Data Using Unsupervised Techniques

Authors: M. A. I Perera, C. R. Wijesinghe, A. R. Weerasinghe

Abstract:

his study was conducted to review and identify the unsupervised techniques that can be employed to analyze gene expression data in order to identify better subtypes of tumors. Identifying subtypes of cancer help in improving the efficacy and reducing the toxicity of the treatments by identifying clues to find target therapeutics. Process of gene expression data analysis described under three steps as preprocessing, clustering, and cluster validation. Feature selection is important since the genomic data are high dimensional with a large number of features compared to samples. Hierarchical clustering and K Means are often used in the analysis of gene expression data. There are several cluster validation techniques used in validating the clusters. Heatmaps are an effective external validation method that allows comparing the identified classes with clinical variables and visual analysis of the classes.

Keywords: cancer subtypes, gene expression data analysis, clustering, cluster validation

Procedia PDF Downloads 118
36732 Effectiveness of Earthing System in Vertical Configurations

Authors: S. Yunus, A. Suratman, N. Mohamad Nor, M. Othman

Abstract:

This paper presents the measurement and simulation results by Finite Element Method (FEM) for earth resistance (RDC) for interconnected vertical ground rod configurations. The soil resistivity was measured using the Wenner four-pin Method, and RDC was measured using the Fall of Potential (FOP) method, as outlined in the standard. Genetic Algorithm (GA) is employed to interpret the soil resistivity to that of a 2-layer soil model. The same soil resistivity data that were obtained by Wenner four-pin method were used in FEM for simulation. This paper compares the results of RDC obtained by FEM simulation with the real measurement at field site. A good agreement was seen for RDC obtained by measurements and FEM. This shows that FEM is a reliable software to be used for design of earthing systems. It is also found that the parallel rod system has a better performance compared to a similar setup using a grid layout.

Keywords: earthing system, earth electrodes, finite element method, genetic algorithm, earth resistances

Procedia PDF Downloads 90
36731 Shield Tunnel Excavation Simulation of a Case Study Using a So-Called 'Stress Relaxation' Method

Authors: Shengwei Zhu, Alireza Afshani, Hirokazu Akagi

Abstract:

Ground surface settlement induced by shield tunneling is addressing increasing attention as shield tunneling becomes a popular construction technique for tunnels in urban areas. This paper discusses a 2D longitudinal FEM simulation of a tunneling case study in Japan (Tokyo Metro Yurakucho Line). Tunneling-induced field data was already collected and is used here for comparison and evaluating purposes. In this model, earth pressure, face pressure, backfilling grouting, elastic tunnel lining, and Mohr-Coulomb failure criterion for soil elements are considered. A method called ‘stress relaxation’ is also exploited to simulate the gradual tunneling excavation. Ground surface settlements obtained from numerical results using the introduced method are then compared with the measurement data.

Keywords: 2D longitudinal FEM model, tunneling case study, stress relaxation, shield tunneling excavation

Procedia PDF Downloads 309
36730 Numerical Calculation of Dynamic Response of Catamaran Vessels Based on 3D Green Function Method

Authors: Md. Moinul Islam, N. M. Golam Zakaria

Abstract:

Seakeeping analysis of catamaran vessels in the earlier stages of design has become an important issue as it dictates the seakeeping characteristics, and it ensures safe navigation during the voyage. In the present paper, a 3D numerical method for the seakeeping prediction of catamaran vessel is presented using the 3D Green Function method. Both steady and unsteady potential flow problem is dealt with here. Using 3D linearized potential theory, the dynamic wave loads and the subsequent response of the vessel is computed. For validation of the numerical procedure catamaran vessel composed of twin, Wigley form demi-hull is used. The results of the present calculation are compared with the available experimental data and also with other calculations. The numerical procedure is also carried out for NPL-based round bilge catamaran, and hydrodynamic coefficients along with heave and pitch motion responses are presented for various Froude number. The results obtained by the present numerical method are found to be in fairly good agreement with the available data. This can be used as a design tool for predicting the seakeeping behavior of catamaran ships in waves.

Keywords: catamaran, hydrodynamic coefficients , motion response, 3D green function

Procedia PDF Downloads 188
36729 A Study on the Solutions of the 2-Dimensional and Forth-Order Partial Differential Equations

Authors: O. Acan, Y. Keskin

Abstract:

In this study, we will carry out a comparative study between the reduced differential transform method, the adomian decomposition method, the variational iteration method and the homotopy analysis method. These methods are used in many fields of engineering. This is been achieved by handling a kind of 2-Dimensional and forth-order partial differential equations called the Kuramoto–Sivashinsky equations. Three numerical examples have also been carried out to validate and demonstrate efficiency of the four methods. Furthermost, it is shown that the reduced differential transform method has advantage over other methods. This method is very effective and simple and could be applied for nonlinear problems which used in engineering.

Keywords: reduced differential transform method, adomian decomposition method, variational iteration method, homotopy analysis method

Procedia PDF Downloads 408
36728 Sentiment Classification Using Enhanced Contextual Valence Shifters

Authors: Vo Ngoc Phu, Phan Thi Tuoi

Abstract:

We have explored different methods of improving the accuracy of sentiment classification. The sentiment orientation of a document can be positive (+), negative (-), or neutral (0). We combine five dictionaries from [2, 3, 4, 5, 6] into the new one with 21137 entries. The new dictionary has many verbs, adverbs, phrases and idioms, that are not in five ones before. The paper shows that our proposed method based on the combination of Term-Counting method and Enhanced Contextual Valence Shifters method has improved the accuracy of sentiment classification. The combined method has accuracy 68.984% on the testing dataset, and 69.224% on the training dataset. All of these methods are implemented to classify the reviews based on our new dictionary and the Internet Movie data set.

Keywords: sentiment classification, sentiment orientation, valence shifters, contextual, valence shifters, term counting

Procedia PDF Downloads 479
36727 Lineup Optimization Model of Basketball Players Based on the Prediction of Recursive Neural Networks

Authors: Wang Yichen, Haruka Yamashita

Abstract:

In recent years, in the field of sports, decision making such as member in the game and strategy of the game based on then analysis of the accumulated sports data are widely attempted. In fact, in the NBA basketball league where the world's highest level players gather, to win the games, teams analyze the data using various statistical techniques. However, it is difficult to analyze the game data for each play such as the ball tracking or motion of the players in the game, because the situation of the game changes rapidly, and the structure of the data should be complicated. Therefore, it is considered that the analysis method for real time game play data is proposed. In this research, we propose an analytical model for "determining the optimal lineup composition" using the real time play data, which is considered to be difficult for all coaches. In this study, because replacing the entire lineup is too complicated, and the actual question for the replacement of players is "whether or not the lineup should be changed", and “whether or not Small Ball lineup is adopted”. Therefore, we propose an analytical model for the optimal player selection problem based on Small Ball lineups. In basketball, we can accumulate scoring data for each play, which indicates a player's contribution to the game, and the scoring data can be considered as a time series data. In order to compare the importance of players in different situations and lineups, we combine RNN (Recurrent Neural Network) model, which can analyze time series data, and NN (Neural Network) model, which can analyze the situation on the field, to build the prediction model of score. This model is capable to identify the current optimal lineup for different situations. In this research, we collected all the data of accumulated data of NBA from 2019-2020. Then we apply the method to the actual basketball play data to verify the reliability of the proposed model.

Keywords: recurrent neural network, players lineup, basketball data, decision making model

Procedia PDF Downloads 106
36726 Atomic Decomposition Audio Data Compression and Denoising Using Sparse Dictionary Feature Learning

Authors: T. Bryan , V. Kepuska, I. Kostnaic

Abstract:

A method of data compression and denoising is introduced that is based on atomic decomposition of audio data using “basis vectors” that are learned from the audio data itself. The basis vectors are shown to have higher data compression and better signal-to-noise enhancement than the Gabor and gammatone “seed atoms” that were used to generate them. The basis vectors are the input weights of a Sparse AutoEncoder (SAE) that is trained using “envelope samples” of windowed segments of the audio data. The envelope samples are extracted from the audio data by performing atomic decomposition with Gabor or gammatone seed atoms. This process identifies segments of audio data that are locally coherent with the seed atoms. Envelope samples are extracted by identifying locally coherent audio data segments with Gabor or gammatone seed atoms, found by matching pursuit. The envelope samples are formed by taking the kronecker products of the atomic envelopes with the locally coherent data segments. Oracle signal-to-noise ratio (SNR) verses data compression curves are generated for the seed atoms as well as the basis vectors learned from Gabor and gammatone seed atoms. SNR data compression curves are generated for speech signals as well as early American music recordings. The basis vectors are shown to have higher denoising capability for data compression rates ranging from 90% to 99.84% for speech as well as music. Envelope samples are displayed as images by folding the time series into column vectors. This display method is used to compare of the output of the SAE with the envelope samples that produced them. The basis vectors are also displayed as images. Sparsity is shown to play an important role in producing the highest denoising basis vectors.

Keywords: sparse dictionary learning, autoencoder, sparse autoencoder, basis vectors, atomic decomposition, envelope sampling, envelope samples, Gabor, gammatone, matching pursuit

Procedia PDF Downloads 229
36725 An Adaptive Oversampling Technique for Imbalanced Datasets

Authors: Shaukat Ali Shahee, Usha Ananthakumar

Abstract:

A data set exhibits class imbalance problem when one class has very few examples compared to the other class, and this is also referred to as between class imbalance. The traditional classifiers fail to classify the minority class examples correctly due to its bias towards the majority class. Apart from between-class imbalance, imbalance within classes where classes are composed of a different number of sub-clusters with these sub-clusters containing different number of examples also deteriorates the performance of the classifier. Previously, many methods have been proposed for handling imbalanced dataset problem. These methods can be classified into four categories: data preprocessing, algorithmic based, cost-based methods and ensemble of classifier. Data preprocessing techniques have shown great potential as they attempt to improve data distribution rather than the classifier. Data preprocessing technique handles class imbalance either by increasing the minority class examples or by decreasing the majority class examples. Decreasing the majority class examples lead to loss of information and also when minority class has an absolute rarity, removing the majority class examples is generally not recommended. Existing methods available for handling class imbalance do not address both between-class imbalance and within-class imbalance simultaneously. In this paper, we propose a method that handles between class imbalance and within class imbalance simultaneously for binary classification problem. Removing between class imbalance and within class imbalance simultaneously eliminates the biases of the classifier towards bigger sub-clusters by minimizing the error domination of bigger sub-clusters in total error. The proposed method uses model-based clustering to find the presence of sub-clusters or sub-concepts in the dataset. The number of examples oversampled among the sub-clusters is determined based on the complexity of sub-clusters. The method also takes into consideration the scatter of the data in the feature space and also adaptively copes up with unseen test data using Lowner-John ellipsoid for increasing the accuracy of the classifier. In this study, neural network is being used as this is one such classifier where the total error is minimized and removing the between-class imbalance and within class imbalance simultaneously help the classifier in giving equal weight to all the sub-clusters irrespective of the classes. The proposed method is validated on 9 publicly available data sets and compared with three existing oversampling techniques that rely on the spatial location of minority class examples in the euclidean feature space. The experimental results show the proposed method to be statistically significantly superior to other methods in terms of various accuracy measures. Thus the proposed method can serve as a good alternative to handle various problem domains like credit scoring, customer churn prediction, financial distress, etc., that typically involve imbalanced data sets.

Keywords: classification, imbalanced dataset, Lowner-John ellipsoid, model based clustering, oversampling

Procedia PDF Downloads 391
36724 Comparative Analysis of Classification Methods in Determining Non-Active Student Characteristics in Indonesia Open University

Authors: Dewi Juliah Ratnaningsih, Imas Sukaesih Sitanggang

Abstract:

Classification is one of data mining techniques that aims to discover a model from training data that distinguishes records into the appropriate category or class. Data mining classification methods can be applied in education, for example, to determine the classification of non-active students in Indonesia Open University. This paper presents a comparison of three methods of classification: Naïve Bayes, Bagging, and C.45. The criteria used to evaluate the performance of three methods of classification are stratified cross-validation, confusion matrix, the value of the area under the ROC Curve (AUC), Recall, Precision, and F-measure. The data used for this paper are from the non-active Indonesia Open University students in registration period of 2004.1 to 2012.2. Target analysis requires that non-active students were divided into 3 groups: C1, C2, and C3. Data analyzed are as many as 4173 students. Results of the study show: (1) Bagging method gave a high degree of classification accuracy than Naïve Bayes and C.45, (2) the Bagging classification accuracy rate is 82.99 %, while the Naïve Bayes and C.45 are 80.04 % and 82.74 % respectively, (3) the result of Bagging classification tree method has a large number of nodes, so it is quite difficult in decision making, (4) classification of non-active Indonesia Open University student characteristics uses algorithms C.45, (5) based on the algorithm C.45, there are 5 interesting rules which can describe the characteristics of non-active Indonesia Open University students.

Keywords: comparative analysis, data mining, clasiffication, Bagging, Naïve Bayes, C.45, non-active students, Indonesia Open University

Procedia PDF Downloads 292
36723 Fuzzy Total Factor Productivity by Credibility Theory

Authors: Shivi Agarwal, Trilok Mathur

Abstract:

This paper proposes the method to measure the total factor productivity (TFP) change by credibility theory for fuzzy input and output variables. Total factor productivity change has been widely studied with crisp input and output variables, however, in some cases, input and output data of decision-making units (DMUs) can be measured with uncertainty. These data can be represented as linguistic variable characterized by fuzzy numbers. Malmquist productivity index (MPI) is widely used to estimate the TFP change by calculating the total factor productivity of a DMU for different time periods using data envelopment analysis (DEA). The fuzzy DEA (FDEA) model is solved using the credibility theory. The results of FDEA is used to measure the TFP change for fuzzy input and output variables. Finally, numerical examples are presented to illustrate the proposed method to measure the TFP change input and output variables. The suggested methodology can be utilized for performance evaluation of DMUs and help to assess the level of integration. The methodology can also apply to rank the DMUs and can find out the DMUs that are lagging behind and make recommendations as to how they can improve their performance to bring them at par with other DMUs.

Keywords: chance-constrained programming, credibility theory, data envelopment analysis, fuzzy data, Malmquist productivity index

Procedia PDF Downloads 330
36722 Preprocessing and Fusion of Multiple Representation of Finger Vein patterns using Conventional and Machine Learning techniques

Authors: Tomas Trainys, Algimantas Venckauskas

Abstract:

Application of biometric features to the cryptography for human identification and authentication is widely studied and promising area of the development of high-reliability cryptosystems. Biometric cryptosystems typically are designed for patterns recognition, which allows biometric data acquisition from an individual, extracts feature sets, compares the feature set against the set stored in the vault and gives a result of the comparison. Preprocessing and fusion of biometric data are the most important phases in generating a feature vector for key generation or authentication. Fusion of biometric features is critical for achieving a higher level of security and prevents from possible spoofing attacks. The paper focuses on the tasks of initial processing and fusion of multiple representations of finger vein modality patterns. These tasks are solved by applying conventional image preprocessing methods and machine learning techniques, Convolutional Neural Network (SVM) method for image segmentation and feature extraction. An article presents a method for generating sets of biometric features from a finger vein network using several instances of the same modality. Extracted features sets were fused at the feature level. The proposed method was tested and compared with the performance and accuracy results of other authors.

Keywords: bio-cryptography, biometrics, cryptographic key generation, data fusion, information security, SVM, pattern recognition, finger vein method.

Procedia PDF Downloads 121
36721 BigCrypt: A Probable Approach of Big Data Encryption to Protect Personal and Business Privacy

Authors: Abdullah Al Mamun, Talal Alkharobi

Abstract:

As data size is growing up, people are became more familiar to store big amount of secret information into cloud storage. Companies are always required to need transfer massive business files from one end to another. We are going to lose privacy if we transmit it as it is and continuing same scenario repeatedly without securing the communication mechanism means proper encryption. Although asymmetric key encryption solves the main problem of symmetric key encryption but it can only encrypt limited size of data which is inapplicable for large data encryption. In this paper we propose a probable approach of pretty good privacy for encrypt big data using both symmetric and asymmetric keys. Our goal is to achieve encrypt huge collection information and transmit it through a secure communication channel for committing the business and personal privacy. To justify our method an experimental dataset from three different platform is provided. We would like to show that our approach is working for massive size of various data efficiently and reliably.

Keywords: big data, cloud computing, cryptography, hadoop, public key

Procedia PDF Downloads 299
36720 A Numerical Investigation of Lamb Wave Damage Diagnosis for Composite Delamination Using Instantaneous Phase

Authors: Haode Huo, Jingjing He, Rui Kang, Xuefei Guan

Abstract:

This paper presents a study of Lamb wave damage diagnosis of composite delamination using instantaneous phase data. Numerical experiments are performed using the finite element method. Different sizes of delamination damages are modeled using finite element package ABAQUS. Lamb wave excitation and responses data are obtained using a pitch-catch configuration. Empirical mode decomposition is employed to extract the intrinsic mode functions (IMF). Hilbert–Huang Transform is applied to each of the resulting IMFs to obtain the instantaneous phase information. The baseline data for healthy plates are also generated using the same procedure. The size of delamination is correlated with the instantaneous phase change for damage diagnosis. It is observed that the unwrapped instantaneous phase of shows a consistent behavior with the increasing delamination size.

Keywords: delamination, lamb wave, finite element method, EMD, instantaneous phase

Procedia PDF Downloads 297
36719 A Relationship Extraction Method from Literary Fiction Considering Korean Linguistic Features

Authors: Hee-Jeong Ahn, Kee-Won Kim, Seung-Hoon Kim

Abstract:

The knowledge of the relationship between characters can help readers to understand the overall story or plot of the literary fiction. In this paper, we present a method for extracting the specific relationship between characters from a Korean literary fiction. Generally, methods for extracting relationships between characters in text are statistical or computational methods based on the sentence distance between characters without considering Korean linguistic features. Furthermore, it is difficult to extract the relationship with direction from text, such as one-sided love, because they consider only the weight of relationship, without considering the direction of the relationship. Therefore, in order to identify specific relationships between characters, we propose a statistical method considering linguistic features, such as syntactic patterns and speech verbs in Korean. The result of our method is represented by a weighted directed graph of the relationship between the characters. Furthermore, we expect that proposed method could be applied to the relationship analysis between characters of other content like movie or TV drama.

Keywords: data mining, Korean linguistic feature, literary fiction, relationship extraction

Procedia PDF Downloads 347
36718 The Benefits of Using Hijab Syar'i against Female Sexual Abuse

Authors: Catur Sigit Hartanto, Anggraeni Anisa Wara Rahmayanti

Abstract:

Objective: This research is aimed to assess the benefits of using hijab syar'i against female sexual abuse. Method: This research uses a quantitative study. The population is students in Semarang State University who wear hijab syar’i. The sampling technique uses the method of conformity. The retrieving data uses questionnaire on 30 female students as the sample. The data analysis uses descriptive analysis. Result: Using hijab syar’i provides benefits in preventing and minimizing female sexual abuse. Limitation: Respondents were limited to only 30 people.

Keywords: hijab syar’i, female, sexual abuse, student of Semarang State University

Procedia PDF Downloads 254
36717 Estimating the Life-Distribution Parameters of Weibull-Life PV Systems Utilizing Non-Parametric Analysis

Authors: Saleem Z. Ramadan

Abstract:

In this paper, a model is proposed to determine the life distribution parameters of the useful life region for the PV system utilizing a combination of non-parametric and linear regression analysis for the failure data of these systems. Results showed that this method is dependable for analyzing failure time data for such reliable systems when the data is scarce.

Keywords: masking, bathtub model, reliability, non-parametric analysis, useful life

Procedia PDF Downloads 532
36716 A Hybrid Feature Selection and Deep Learning Algorithm for Cancer Disease Classification

Authors: Niousha Bagheri Khulenjani, Mohammad Saniee Abadeh

Abstract:

Learning from very big datasets is a significant problem for most present data mining and machine learning algorithms. MicroRNA (miRNA) is one of the important big genomic and non-coding datasets presenting the genome sequences. In this paper, a hybrid method for the classification of the miRNA data is proposed. Due to the variety of cancers and high number of genes, analyzing the miRNA dataset has been a challenging problem for researchers. The number of features corresponding to the number of samples is high and the data suffer from being imbalanced. The feature selection method has been used to select features having more ability to distinguish classes and eliminating obscures features. Afterward, a Convolutional Neural Network (CNN) classifier for classification of cancer types is utilized, which employs a Genetic Algorithm to highlight optimized hyper-parameters of CNN. In order to make the process of classification by CNN faster, Graphics Processing Unit (GPU) is recommended for calculating the mathematic equation in a parallel way. The proposed method is tested on a real-world dataset with 8,129 patients, 29 different types of tumors, and 1,046 miRNA biomarkers, taken from The Cancer Genome Atlas (TCGA) database.

Keywords: cancer classification, feature selection, deep learning, genetic algorithm

Procedia PDF Downloads 90
36715 Remaining Useful Life (RUL) Assessment Using Progressive Bearing Degradation Data and ANN Model

Authors: Amit R. Bhende, G. K. Awari

Abstract:

Remaining useful life (RUL) prediction is one of key technologies to realize prognostics and health management that is being widely applied in many industrial systems to ensure high system availability over their life cycles. The present work proposes a data-driven method of RUL prediction based on multiple health state assessment for rolling element bearings. Bearing degradation data at three different conditions from run to failure is used. A RUL prediction model is separately built in each condition. Feed forward back propagation neural network models are developed for prediction modeling.

Keywords: bearing degradation data, remaining useful life (RUL), back propagation, prognosis

Procedia PDF Downloads 412
36714 Mining Big Data in Telecommunications Industry: Challenges, Techniques, and Revenue Opportunity

Authors: Hoda A. Abdel Hafez

Abstract:

Mining big data represents a big challenge nowadays. Many types of research are concerned with mining massive amounts of data and big data streams. Mining big data faces a lot of challenges including scalability, speed, heterogeneity, accuracy, provenance and privacy. In telecommunication industry, mining big data is like a mining for gold; it represents a big opportunity and maximizing the revenue streams in this industry. This paper discusses the characteristics of big data (volume, variety, velocity and veracity), data mining techniques and tools for handling very large data sets, mining big data in telecommunication and the benefits and opportunities gained from them.

Keywords: mining big data, big data, machine learning, telecommunication

Procedia PDF Downloads 372
36713 An Embarrassingly Simple Semi-supervised Approach to Increase Recall in Online Shopping Domain to Match Structured Data with Unstructured Data

Authors: Sachin Nagargoje

Abstract:

Complete labeled data is often difficult to obtain in a practical scenario. Even if one manages to obtain the data, the quality of the data is always in question. In shopping vertical, offers are the input data, which is given by advertiser with or without a good quality of information. In this paper, an author investigated the possibility of using a very simple Semi-supervised learning approach to increase the recall of unhealthy offers (has badly written Offer Title or partial product details) in shopping vertical domain. The author found that the semisupervised learning method had improved the recall in the Smart Phone category by 30% on A=B testing on 10% traffic and increased the YoY (Year over Year) number of impressions per month by 33% at production. This also made a significant increase in Revenue, but that cannot be publicly disclosed.

Keywords: semi-supervised learning, clustering, recall, coverage

Procedia PDF Downloads 98