Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 6700

Search results for: features comparison

6700 Comparison between XGBoost, LightGBM and CatBoost Using a Home Credit Dataset

Authors: Essam Al Daoud

Abstract:

Gradient boosting methods have proven to be a very important strategy, and many successful machine learning solutions have been developed using XGBoost and its derivatives. The aim of this study is to investigate and compare the efficiency of three gradient boosting methods. The Home Credit dataset used in this work contains 219 features and 356251 records. In addition, new features are generated, and several techniques are used to rank and select the best features. The results indicate that LightGBM is faster and more accurate than CatBoost and XGBoost across varying numbers of features and records.

Keywords: gradient boosting, XGBoost, LightGBM, CatBoost, home credit

Procedia PDF Downloads 63
6699 An Automatic Feature Extraction Technique for 2D Punch Shapes

Authors: Awais Ahmad Khan, Emad Abouel Nasr, H. M. A. Hussein, Abdulrahman Al-Ahmari

Abstract:

Sheet-metal parts have been widely applied in the electronics, communication and mechanical industries in recent decades, but advances in sheet-metal part design and manufacturing still lag behind the increasing importance of these parts in modern industry. This paper presents a methodology for the automatic extraction of some common 2D internal sheet-metal features. The features used in this study are taken from the Unipunch™ catalogue. The extraction process starts with data extraction from a STEP file using an object-oriented approach; with the application of suitable algorithms and rules, all features contained in the catalogue are automatically extracted. Since the extracted features include geometry and engineering information, they will be effective for downstream applications such as feature rebuilding and process planning.

Keywords: feature extraction, internal features, punch shapes, sheet metal

Procedia PDF Downloads 499
6698 Human-Machine Cooperation in Facial Comparison Based on Likelihood Scores

Authors: Lanchi Xie, Zhihui Li, Zhigang Li, Guiqiang Wang, Lei Xu, Yuwen Yan

Abstract:

Image-based facial features can be classified into category recognition features and individual recognition features. Current automated face recognition systems extract a feature vector, whose dimension varies by system, from a facial image according to their pre-trained neural network. However, to improve the efficiency of parameter calculation, an algorithm generally reduces image detail by pooling. This operation overlooks details that matter greatly to forensic experts. In our experiment, we adopted a variety of deep-learning-based face recognition algorithms and compared a large number of naturally collected face images with frontal ID photos of the same persons. Downscaling and manual handling were performed on the testing images. The results indicated that the deep-learning-based facial recognition algorithms detected structural and morphological information and rarely focused on specific markers such as stains and moles. Overall performance, the distributions of genuine and impostor scores, and likelihood ratios were tested to evaluate the accuracy of the biometric systems and forensic experts. Experiments showed that the biometric systems were skilled in distinguishing category features, while forensic experts were better at discovering the individual features of human faces. In the proposed approach, fusion was performed at the score level. At the specified false accept rate, the framework achieved a lower false reject rate. This paper contributes to improving the interpretability of the objective method of facial comparison and provides a novel method for human-machine collaboration in this field.
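
The score-level fusion and the "false reject rate at a specified false accept rate" evaluation mentioned above can be illustrated with a minimal sketch. The weighted-sum fusion rule, the function names and the toy scores are assumptions made for illustration; the abstract does not state which fusion rule or data the authors used.

```python
from typing import Sequence


def fuse_scores(machine: float, expert: float, weight: float = 0.5) -> float:
    """Weighted-sum fusion of two similarity scores (assumed to lie in [0, 1]).

    The weighted sum is an illustrative choice, not the paper's stated rule.
    """
    return weight * machine + (1.0 - weight) * expert


def threshold_at_far(impostor: Sequence[float], target_far: float) -> float:
    """Return the smallest observed impostor score t for which the share of
    impostor scores strictly above t does not exceed target_far."""
    ranked = sorted(impostor)
    for t in ranked:
        far = sum(s > t for s in impostor) / len(impostor)
        if far <= target_far:
            return t
    return ranked[-1]


def false_reject_rate(genuine: Sequence[float], threshold: float) -> float:
    """Share of genuine comparisons rejected, i.e. scoring at or below threshold."""
    return sum(s <= threshold for s in genuine) / len(genuine)


# Hypothetical toy scores: a comparison is accepted when its score exceeds the threshold.
impostor_scores = [0.1, 0.2, 0.3, 0.4]
genuine_scores = [0.35, 0.6, 0.8, 0.25]
t = threshold_at_far(impostor_scores, target_far=0.25)
frr = false_reject_rate(genuine_scores, t)
```

Comparing `frr` for the machine scores alone, the expert scores alone and the fused scores, all at the same `target_far`, is one way to check whether fusion lowers the false reject rate.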

Keywords: likelihood ratio, automated facial recognition, facial comparison, biometrics

Procedia PDF Downloads 36
6697 Forensic Comparison of Facial Images for Human Identification

Authors: D. P. Gangwar

Abstract:

The identification of humans through facial images is of great importance in forensic science. Video recordings, CCTV footage, passports, driver licenses and other related documents are invariably sent to the laboratory so that the questioned photographs and video recordings can be compared with suspected photographs or recordings to prove the identity of a person. More than 300 questioned and 300 control photographs from actual crime cases, received from various investigation agencies, have been compared by me so far using various familiar analysis and comparison techniques such as holistic comparison, morphological analysis, photo-anthropometry and superimposition. On the basis of the findings obtained during the examination of this large set of photo exhibits, a realistic and comprehensive technique has been proposed which could be very useful for forensic examination.

Keywords: CCTV Images, facial features, photo-anthropometry, superimposition

Procedia PDF Downloads 438
6696 Comparing Emotion Recognition from Voice and Facial Data Using Time Invariant Features

Authors: Vesna Kirandziska, Nevena Ackovska, Ana Madevska Bogdanova

Abstract:

Emotion recognition is a challenging problem, and it remains open from the perspective of both intelligent systems and psychology. In this paper, both voice features and facial features are used for building an emotion recognition system. Support Vector Machine classifiers are built using raw data from video recordings. The results obtained for emotion recognition are given, and a discussion of the validity and expressiveness of different emotions is presented. A comparison is made between the classifiers built from facial data only, from voice data only, and from the combination of both. The need for a better combination of the information from facial expression and voice data is argued.

Keywords: emotion recognition, facial recognition, signal processing, machine learning

Procedia PDF Downloads 206
6695 Return of Equity and Labor Productivity Comparison on Some Sino-Foreign Commercial Banks

Authors: Xiaojun Wang

Abstract:

In a lucky emerging market, most Sino commercial banks have developed rapidly and achieved dazzling performance in recent years. As a large, sound commercial bank with a long history, Wells Fargo Company (WFC) is taken as a mirror in this paper in order to roughly gauge where the Sino banks stand in their life cycle in comparison with WFC. Two financial measures, return on equity (ROE) and overall labor productivity (OLP), and three commercial banks, the Hong Kong and Shanghai Banking Corporation Limited (HSBC), the Bank of Communication (BCM) and China Minsheng Bank (CMSB), are selected. The comparison data, taken from the historical annual reports of each company, span 13 to 51 years. The results indicate that most Sino commercial banks will continue to develop, with lower performance on these financial measures, over the next several decades.

Keywords: commercial bank, features comparison, labor productivity, return on equity

Procedia PDF Downloads 109
6694 Relevant LMA Features for Human Motion Recognition

Authors: Insaf Ajili, Malik Mallem, Jean-Yves Didier

Abstract:

Motion recognition from videos is a very complex task due to the high variability of motions. This paper describes the challenges of human motion recognition, especially the step of representing motion with relevant features. Our descriptor vector is inspired by the Laban Movement Analysis (LMA) method. We propose discriminative features using the Random Forest algorithm in order to remove redundant features and make learning algorithms operate faster and more effectively. We validate our method on the MSRC-12 and UTKinect datasets.

Keywords: discriminative LMA features, features reduction, human motion recognition, random forest

Procedia PDF Downloads 62
6693 Semantic Features of Turkish and Spanish Phraseological Units with a Somatic Component ‘Hand’

Authors: Narmina Mammadova

Abstract:

In modern linguistics, the comparative study of languages is becoming increasingly popular, and the typology and comparison of languages that have different structures are expanding and deepening. Of particular interest is the study of phraseological units, which makes it possible to identify the specific features of the compared languages in all their national identity. This paper gives a brief comparative analysis of somatic phraseological units (SFU) of the Spanish and Turkish languages with the component "hand" in the semantic aspect, covering the identification of equivalents, analogs and non-equivalent units as well as a description of methods for translating non-equivalent somatic phraseological units. Comparative study of the phraseology of unrelated languages is of particular relevance since it allows us to identify both general, universal features and differential and specific features characteristic of a particular language. Based on the results of the generalization of the study, it can be assumed that phraseological units containing a somatic component have a high interlingual phraseological activity, which contributes to an increase in the degree of interlingual equivalence.

Keywords: Linguoculturology, Turkish, Spanish, language picture of the world, phraseological units, semantic microfield

Procedia PDF Downloads 2
6692 The Experience with SiC MOSFET and Buck Converter Snubber Design

Authors: Petr Vaculik

Abstract:

The newest semiconductor devices on the market are MOSFET transistors based on silicon carbide (SiC). This material has exclusive features thanks to which it makes a better switch than a silicon (Si) semiconductor switch. There are some special features that need to be understood to enable the device to be used to its full potential. The advantages and differences of SiC MOSFETs in comparison with Si IGBT transistors are described in the first part of this article. The second part describes a driver for the SiC MOSFET transistor, and the last part presents the SiC MOSFET in a buck (step-down) converter application and the design of a simple RC snubber.

Keywords: SiC, Si, MOSFET, IGBT, SBD, RC snubber

Procedia PDF Downloads 287
6691 Tree Species Classification Using Effective Features of Polarimetric SAR and Hyperspectral Images

Authors: Milad Vahidi, Mahmod R. Sahebi, Mehrnoosh Omati, Reza Mohammadi

Abstract:

Forest management organizations need information to perform their work effectively. Remote sensing is an effective method to acquire information from the Earth. Two datasets of remote sensing images were used to classify forested regions. Firstly, all extractable features from the hyperspectral and PolSAR images were extracted. The optical features were spectral indexes related to chemical and water contents, structural indexes, effective bands and absorption features. The PolSAR features were the original data, target decomposition components, and SAR discriminator features. Secondly, particle swarm optimization (PSO) and genetic algorithms (GA) were applied to select the optimal features. Furthermore, a support vector machine (SVM) classifier was used to classify the image. The results showed that the combination of PSO and SVM had higher overall accuracy than the other cases, at about 90.56%. The effective features were the spectral indexes, the bands in the shortwave infrared (SWIR) and visible ranges, and certain PolSAR features.

Keywords: hyperspectral, PolSAR, feature selection, SVM

Procedia PDF Downloads 95
6690 Investigations of Protein Aggregation Using Sequence and Structure Based Features

Authors: M. Michael Gromiha, A. Mary Thangakani, Sandeep Kumar, D. Velmurugan

Abstract:

The main cause of several neurodegenerative diseases such as Alzheimer's, Parkinson's, and spongiform encephalopathies is the formation of amyloid fibrils and plaques in proteins. We have analyzed different sets of proteins and peptides to understand the influence of sequence-based features on the protein aggregation process. The comparison of 373 pairs of homologous mesophilic and thermophilic proteins showed that aggregation-prone regions (APRs) are present in both. However, the thermophilic protein monomers show a greater ability to ‘stow away’ the APRs in their hydrophobic cores and protect them from solvent exposure. The comparison of amyloid forming and amorphous β-aggregating hexapeptides suggested distinct preferences for specific residues at the six positions as well as all possible combinations of nine residue pairs. The compositions of residues at different positions and residue pairs have been converted into energy potentials and utilized for distinguishing between amyloid forming and amorphous β-aggregating peptides. Our method could correctly identify the amyloid forming peptides with an accuracy of 95-100% in different datasets of peptides.

Keywords: aggregation, amyloids, thermophilic proteins, amino acid residues, machine learning techniques

Procedia PDF Downloads 495
6689 2D Point Clouds Features from Radar for Helicopter Classification

Authors: Danilo Habermann, Aleksander Medella, Carla Cremon, Yusef Caceres

Abstract:

This paper analyzes the ability of 2D point cloud features to classify different helicopter models using radar. The method does not need to estimate the blade length, the number of blades, or the period of the micro-Doppler signatures. Nor is it necessary to generate spectrograms (or any other image based on the time and frequency domains). This work transforms a radar return signal into a 2D point cloud and extracts features from it. Three classifiers are used to distinguish 9 different helicopter models in order to analyze the performance of the features used in this work. The high accuracy obtained with each of the classifiers demonstrates that the 2D point cloud features are very useful for classifying helicopters from radar signals.

Keywords: helicopter classification, point clouds features, radar, supervised classifiers

Procedia PDF Downloads 80
6688 Dynamic Gabor Filter Facial Features-Based Recognition of Emotion in Video Sequences

Authors: T. Hari Prasath, P. Ithaya Rani

Abstract:

In the world of visual technology, recognizing emotions from face images is a challenging task. Several related methods have not utilized dynamic facial features effectively for high performance. This paper proposes a high-performance method for emotion recognition using dynamic facial features. Initially, local features are captured in each frame by Gabor filters at different scales and orientations to find the position and scale of the face region against different backgrounds. The Gabor features are sent to an ensemble classifier for detecting Gabor facial features. The region of dynamic features, which represents the dynamic variations of facial appearance, is captured from the Gabor facial features in consecutive frames. Each region of dynamic features is normalized using the Z-score normalization method and further encoded into binary pattern features with the help of threshold values. The binary features are passed to a multi-class AdaBoost classifier, trained on a database containing happiness, sadness, surprise, fear, anger, disgust, and neutral expressions, to classify the discriminative dynamic features for emotion recognition. The developed method is evaluated on the Ryerson Multimedia Research Lab and Cohn-Kanade databases and shows significant performance improvement, owing to the dynamic features, when compared with existing methods.
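
The Z-score normalization and threshold-based binary encoding step can be sketched as follows. This is a minimal illustration only: the function names, the use of the population standard deviation and the threshold of 0.0 are assumptions, since the abstract does not give the actual threshold values or the encoding details.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def z_score(values: Sequence[float]) -> List[float]:
    """Normalize values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant region carries no variation; map it to all zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def binarize(values: Sequence[float], threshold: float = 0.0) -> List[int]:
    """Encode each normalized value as 1 if it exceeds the threshold, else 0.

    The threshold 0.0 is a placeholder chosen for illustration.
    """
    return [1 if v > threshold else 0 for v in values]


# Hypothetical feature responses from one region of dynamic features.
region = [2, 4, 4, 4, 5, 5, 7, 9]
normalized = z_score(region)   # mean 5, standard deviation 2
pattern = binarize(normalized)
```

A vector such as `pattern` is the kind of binary feature that would then be passed to the classifier.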

Keywords: detecting face, Gabor filter, multi-class AdaBoost classifier, Z-score normalization

Procedia PDF Downloads 173
6687 New Features for Copy-Move Image Forgery Detection

Authors: Michael Zimba

Abstract:

A novel set of features for copy-move image forgery (CMIF) detection is proposed. The proposed set presents a new approach which relies on electrostatic field theory (EFT). Solely for the purpose of reducing the dimension of a suspicious image, the method first performs a discrete wavelet transform (DWT) of the suspicious image and extracts only the approximation subband. The extracted subband is then bijectively mapped onto a virtual electrostatic field where concepts of EFT are utilised to extract robust features. The extracted features are shown to be invariant to additive noise, JPEG compression, and affine transformation. The proposed features can also be used in general object matching.
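
The dimension-reduction step, keeping only the DWT approximation subband, can be sketched as below. The abstract does not name the wavelet, so a one-level orthonormal Haar transform is assumed here purely for illustration, and `haar_approximation` is a hypothetical helper name.

```python
from typing import List, Sequence


def haar_approximation(image: Sequence[Sequence[float]]) -> List[List[float]]:
    """Return the approximation (LL) subband of a one-level 2D Haar DWT.

    Each output value is the sum of a 2x2 pixel block divided by 2, which is
    what the orthonormal Haar low-pass filter yields when applied along rows
    and then along columns. The output has half the height and half the width.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("this sketch expects an even height and width")
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) / 2.0
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


# Hypothetical 4x4 grayscale patch; the subband is 2x2.
patch = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
approx = haar_approximation(patch)
```

Each level of the transform reduces the pixel count by a factor of four, which is why the approximation subband is a convenient smaller input for the later feature extraction.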

Keywords: virtual electrostatic field, features, affine transformation, copy-move image forgery

Procedia PDF Downloads 417
6686 Profit and Nonprofit Sports Clubs, Financial and Organizational Comparison in Poland

Authors: Igor Perechuda, Wojciech Cieśliński

Abstract:

The paper identifies the features of Polish sports clubs in two particular organizational forms: profit and nonprofit. These features are identified and described in terms of the financial efficiency of the given organizational form. In terms of efficiency, the research makes it possible to specify the advantages and limitations of each organizational form of sports club. The paper considers the features of sports clubs under Polish conditions, such as legal regulations. The sources of the functioning efficiency of sports clubs may lie in the organizational forms in which they operate. Each of the available forms can be considered either a for-profit or a nonprofit enterprise. Depending on this classification, there are different capabilities for increasing the organizational and financial efficiency of a given sports club. The authors start with a general classification of, and the differences between, for-profit and nonprofit sports clubs. They then identify the specific financial and organizational conditions of both organizational forms and show examples of mixed activity forms and their effect on efficiency.

Keywords: financial efficiency, for-profit, non-profit, sports club

Procedia PDF Downloads 432
6685 Evaluation of Easy-to-Use Energy Building Design Tools for Solar Access Analysis in Urban Contexts: Comparison of Friendly Simulation Design Tools for Architectural Practice in the Early Design Stage

Authors: M. Iommi, G. Losco

Abstract:

The current building sector is focused on the reduction of energy requirements, on renewable energy generation and on the regeneration of existing urban areas. These targets need to be addressed with a systemic approach, considering several aspects simultaneously such as climate conditions, lighting conditions, solar radiation, PV potential, etc. Solar access analysis is an already known method for analyzing solar potential, but in recent years simulation tools have provided more effective opportunities to perform this type of analysis, particularly in the early design stage. Nowadays, the study of solar access depends on how easily and rapidly simulation tools can be used during the design process. This study presents a comparison of three simulation tools from the point of view of the user, with the aim of highlighting differences in their ease of use. Using a real urban context as a case study, three tools, Ecotect, Townscope and Heliodon, are tested by building models, running simulations, and examining the capabilities and output results of solar access analysis. The evaluation of the ease of use of these tools is based on selected parameters and features, such as the types of simulation, input data requirements, types of results, etc. As a result, a framework is provided that shows the features and capabilities of each tool and the differences among the tools in functions, features and capabilities. The aim of this study is to support users and to improve the integration of solar access simulation tools with the design process.

Keywords: energy building design tools, solar access analysis, solar potential, urban planning

Procedia PDF Downloads 246
6684 Using Reservoir Models for Monitoring Geothermal Surface Features

Authors: John P. O’Sullivan, Thomas M. P. Ratouis, Michael J. O’Sullivan

Abstract:

As the use of geothermal energy grows internationally, more effort is required to monitor and protect areas with rare and important geothermal surface features. A number of approaches are presented for developing and calibrating numerical geothermal reservoir models that are capable of accurately representing geothermal surface features. The approaches are discussed in the context of case studies of the Rotorua geothermal system and the Orakei-korako geothermal system, both of which contain important surface features. The results show that models are able to match the available field data accurately and hence can be used as valuable tools for predicting the future response of the systems to changes in use.

Keywords: geothermal reservoir models, surface features, monitoring, TOUGH2

Procedia PDF Downloads 291
6683 Comparison of Multivariate Adaptive Regression Splines and Random Forest Regression in Predicting Forced Expiratory Volume in One Second

Authors: P. V. Pramila, V. Mahesh

Abstract:

Pulmonary function tests are important non-invasive diagnostic tests that assess respiratory impairments and provide quantifiable measures of lung function. Spirometry is the most frequently used measure of lung function and plays an essential role in the diagnosis and management of pulmonary diseases. However, the test requires considerable patient effort and cooperation, which is markedly related to the age of patients, resulting in incomplete data sets. This paper presents nonlinear models built using multivariate adaptive regression splines (MARS) and random forest (RF) regression to predict the missing spirometric features. Random-forest-based feature selection is used to enhance both the generalization capability and the model interpretability. In the present study, flow-volume data are recorded for N = 198 subjects. The ranked order of the feature importance index calculated by the random forest model shows that the spirometric features FVC, FEF 25, PEF, FEF 25-75, and FEF 50, and the demographic parameter height, are the important descriptors. A comparison of the performance of both models shows that the prediction ability of MARS with the top two ranked features, namely FVC and FEF 25, is higher, yielding a model fit of R² = 0.96 and R² = 0.99 for normal and abnormal subjects, respectively. The root mean square error analysis of the RF model and the MARS model also shows that the latter is capable of predicting the missing values of FEV1 with a notably lower error of 0.0191 (normal subjects) and 0.0106 (abnormal subjects). It is concluded that combining feature selection with a prediction model provides a minimum subset of predominant features to train the model, yielding better prediction performance. This analysis can assist clinicians with an intelligent support system for medical diagnosis and the improvement of clinical care.
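
For reference, the two model-fit measures quoted above, the coefficient of determination R² and the root mean square error, can be computed as in the following sketch. The function names and sample values are illustrative and are not taken from the study's data.

```python
from math import sqrt
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between measured and predicted values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(squared))


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical FEV1 values in litres: measured versus model output.
measured = [2.9, 3.4, 3.1, 2.5]
modelled = [3.0, 3.3, 3.1, 2.6]
error = rmse(measured, modelled)
fit = r_squared(measured, modelled)
```

A lower `error` and a `fit` closer to 1 both indicate better agreement between the model and the measured values.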

Keywords: FEV, multivariate adaptive regression splines, pulmonary function test, random forest

Procedia PDF Downloads 214
6682 Myanmar Character Recognition Using Eight Direction Chain Code Frequency Features

Authors: Kyi Pyar Zaw, Zin Mar Kyu

Abstract:

Character recognition is the process of converting a text image file into an editable and searchable text file. Feature extraction is the heart of any character recognition system, and the recognition rate may be low or high depending on the extracted features. In this paper, 25 features per character are used in character recognition. Character recognition has three basic steps: character segmentation, feature extraction and classification. In the segmentation step, a horizontal cropping method is used for line segmentation and a vertical cropping method is used for character segmentation. In the feature extraction step, features are extracted in two ways. In the first, 8 features are extracted from the entire input character using eight-direction chain code frequency extraction. In the second, the input character is divided into 16 blocks. For each block, although 8 feature values are obtained through the eight-direction chain code frequency extraction method, we define the sum of these 8 feature values as a single feature for the block. Therefore, 16 features are extracted from the 16 blocks in the second way. We use the number-of-holes feature to cluster similar characters. We can recognize almost all common Myanmar characters in various font sizes by using these features. All 25 features are used in both the training part and the testing part. In the classification step, characters are classified by matching all the features of the input character with the already trained features of characters.
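
The eight-direction chain code frequency features can be illustrated with a short sketch. It assumes that the character boundary is already available as an ordered list of 8-connected pixel coordinates (boundary tracing, cropping and the 16-block split are not shown) and that "frequency" means a raw count per direction; both are assumptions, as is the direction numbering.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int]

# Freeman-style direction codes for a step (dx, dy), with y growing downward
# as in image coordinates: 0 = east, then counter-clockwise to 7 = south-east.
DIRECTIONS = {
    (1, 0): 0, (1, -1): 1, (0, -1): 2, (-1, -1): 3,
    (-1, 0): 4, (-1, 1): 5, (0, 1): 6, (1, 1): 7,
}


def chain_code(boundary: Sequence[Point]) -> List[int]:
    """Convert an ordered boundary into a list of direction codes 0..7."""
    codes = []
    for (x0, y0), (x1, y1) in zip(boundary, boundary[1:]):
        step = (x1 - x0, y1 - y0)
        if step not in DIRECTIONS:
            raise ValueError(f"points are not 8-connected neighbours: {step}")
        codes.append(DIRECTIONS[step])
    return codes


def chain_code_counts(boundary: Sequence[Point]) -> List[int]:
    """Count how often each of the eight directions occurs (8 feature values)."""
    counts = [0] * 8
    for code in chain_code(boundary):
        counts[code] += 1
    return counts


# Hypothetical closed boundary of a one-pixel square.
square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
whole_character_features = chain_code_counts(square)
block_feature = sum(whole_character_features)  # one summed value, as used per block
```

Under these assumptions, the 8 counts for the whole character, the 16 per-block sums and the single hole count give the 25 features described above.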

Keywords: chain code frequency, character recognition, feature extraction, features matching, segmentation

Procedia PDF Downloads 218
6681 An Experimental Study for Assessing Email Classification Attributes Using Feature Selection Methods

Authors: Issa Qabaja, Fadi Thabtah

Abstract:

Email phishing classification is one of the vital problems in the online security research domain and has attracted several scholars due to its impact on the payments users perform online daily. One way for detection algorithms to reach good performance on the email phishing problem is to identify the minimal set of features that significantly raise the phishing detection rate. This paper investigates three known feature selection methods, Information Gain (IG), Chi-square and Correlation Features Set (CFS), on the email phishing problem to separate highly influential features from less influential ones in phishing detection. We measure the degree of influence by applying four data mining algorithms to a large set of features. We compare the accuracy of these algorithms on the complete feature set before and after feature selection has been applied. The experimental results show that 12 common significant features were chosen from the considered features by the feature selection methods. Further, the average detection accuracy achieved by the data mining algorithms on the reduced 12-feature set was only very slightly affected when compared with that achieved on the 47-feature set.
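
Of the three feature selection methods named above, Information Gain is the simplest to illustrate. The sketch below scores a discrete feature by how much it reduces the entropy of the class labels; the function names, feature names and toy data are hypothetical and are not drawn from the paper's 47-feature set.

```python
from collections import Counter
from math import log2
from typing import Hashable, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """Entropy of the labels minus their expected entropy after splitting on the feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


# Hypothetical emails: label 1 = phishing, 0 = legitimate.
labels = [1, 1, 0, 0]
has_ip_url = [1, 1, 0, 0]    # perfectly separates the two classes
has_greeting = [1, 0, 1, 0]  # carries no information about the class
scores = {
    "has_ip_url": information_gain(has_ip_url, labels),
    "has_greeting": information_gain(has_greeting, labels),
}
ranked = sorted(scores, key=scores.get, reverse=True)
```

Ranking features by such a score and keeping the top entries is one common way to derive a reduced feature set.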

Keywords: data mining, email classification, phishing, online security

Procedia PDF Downloads 344
6680 Exploring Syntactic and Semantic Features for Text-Based Authorship Attribution

Authors: Haiyan Wu, Ying Liu, Shaoyun Shi

Abstract:

Authorship attribution extracts features to identify the authors of anonymous documents. Many previous works on authorship attribution focus on statistical style features (e.g., sentence/word length) and content features (e.g., frequent words, n-grams). Modeling these features by regression or some transparent machine learning methods gives a portrait of the authors' writing style. But these methods do not capture the syntactic (e.g., dependency relationship) or semantic (e.g., topics) information. In recent years, some researchers have modeled syntactic trees or latent semantic information with neural networks. However, few works take them together. Besides, predictions by neural networks are difficult to explain, and explainability is vital in authorship attribution tasks. In this paper, we not only utilize the statistical style and content features but also take advantage of both syntactic and semantic features. Different from an end-to-end neural model, feature selection and prediction are two steps in our method. An attentive n-gram network is utilized to select useful features, and logistic regression is applied to give prediction and understandable representation of writing style. Experiments show that our extracted features can improve the state-of-the-art methods on three benchmark datasets.

Keywords: authorship attribution, attention mechanism, syntactic feature, feature extraction

Procedia PDF Downloads 34
6679 A Comparison of Computational and Experimental Data to Investigate the Influence of the Tangential Velocity of Inner Rotating Wall on Axial Velocity Profile of Flow through Vertical Annular Pipe with Rotating Inner Surface

Authors: Abdusalam Sharf

Abstract:

In the oil and gas industries, one of the most important issues in drilling wells is understanding the behavior of flow through a vertical annular gap whose outer wall is stationary whilst the inner wall rotates. The main emphasis is placed on a comparison of experimental and computational investigations into the effects of the rotation speed of the inner pipe on the axial velocity profiles. The computational investigations were carried out using the CFD software Gambit and Fluent. Three turbulence models were used: standard, RNG with enhanced wall treatment, and SST. The axial velocity profiles were investigated at different rotation speeds of the inner pipe and at three different volumetric flow rates. The comparison showed that the calculations satisfactorily predict the qualitative features of the axial and swirl velocity profiles and that the RNG model gives the best results.

Keywords: computational fluid dynamics (CFD), SST k−ω shear-stress transport (k−ω mode variant), RNG k–ε renormalisation group (k−ε mode variant), y+ dimensionless distance from wall

Procedia PDF Downloads 281
6678 Microscopic Features Influences on Textile Fabrics Self-Cleaning Ability

Authors: Ayat Adnan Atwah

Abstract:

Self-cleaning ability in textile fabrics has been comprehensively investigated in the last decade. Most of these investigations have used surface roughness and low surface energy features to establish a self-cleaning mechanism. Extensive research articles and reviews have been published to describe these processes along with their microscopic features. When these are reviewed with a critical eye, it is found that a comprehensive effort is still required to compile all this previous research, emphasizing how textile fabrics' microscopic features can influence their self-cleaning ability. No research has been conducted to explore the self-cleaning potential of microscopic geometrical features of fabric at the woven structural level. Researchers have used microscopic features to increase the mechanical strength of the fabric. However, they did not change the microscopic features at the woven level to evaluate the self-cleaning ability. In the existing literature, researchers have tried to develop self-cleaning textiles with the help of coatings on the fabric. These coatings are applied to the fabrics by using spray and nanoparticle processing. The coatings create a different surface on the fabric, and hence the changes in the microscopic features of this surface control the self-cleaning ability. Instead of using an additional coating, the microscopic features of the fabric itself can also influence the surface roughness and low surface energy and provide self-cleaning ability at the woven structural level. Key microscopic features like surface roughness, porosity, and wettability of a textile fabric are still not comprehensively investigated for their influence on the fabric’s self-cleaning ability. Significantly, the interdependencies between these features and the overall fabric geometry at the woven level have not been explored quantitatively. Past literature has mainly made qualitative observations. However, producing fabrics with self-cleaning ability at mass scale requires extensive empirical studies. These studies must involve parametric analysis of varying values of the microscopic features and their quantitative influence on the desired self-cleaning feature.

Keywords: self-cleaning ability, influence, microscopic features, textile fabrics

Procedia PDF Downloads 47
6677 Neural Graph Matching for Modification Similarity Applied to Electronic Document Comparison

Authors: Po-Fang Hsu, Chiching Wei

Abstract:

In this paper, we present a novel neural graph matching approach applied to document comparison. Document comparison is a common task in the legal and financial industries. In some cases, the most important differences may be the addition or omission of words, sentences, clauses, or paragraphs. However, identifying them is challenging when the editing process has not been recorded or traced. Under these temporal uncertainties, we explore the potential of our approach to approximate an accurate comparison and determine which element blocks are related to others by an edit. First, we apply a document layout analysis that combines traditional and modern techniques to segment layouts into blocks of various types. We then transform the task into a problem of layout graph matching with textual awareness. Graph matching is a long-studied problem with a broad range of applications. However, unlike previous works that focus on visual images or structural layout, we also bring textual features into our model to adapt it to this domain. Specifically, for electronic documents, we introduce an encoder to handle the visual presentation decoded from PDF. Additionally, because modifications can make the document layout analysis inconsistent between modified documents, and blocks can be merged and split, we adopt Sinkhorn divergence in our neural graph approach, which tries to overcome both issues with many-to-many block matching. We demonstrate this on two categories of layouts, legal agreements and scientific articles, collected from our real-case datasets.
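As a rough illustration of the many-to-many block matching mentioned in the abstract, the sketch below runs plain Sinkhorn iterations, the entropy-regularised optimal transport computation that Sinkhorn divergence builds on. It is not the authors' model: the function name `sinkhorn_plan`, the hand-written cost matrix, the size-weighted marginals, and the regularisation value are all assumptions chosen for the example.

```python
import math

def sinkhorn_plan(cost, a, b, reg=0.1, n_iters=200):
    """Return an entropy-regularised transport plan with row marginals a and column marginals b."""
    n, m = len(cost), len(cost[0])
    # Gibbs kernel: low cost -> high affinity
    K = [[math.exp(-c / reg) for c in row] for row in cost]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Hypothetical case: old block 0 was split into new blocks 0 and 1,
# old block 1 survives as new block 2. Costs are made up for illustration.
cost = [
    [0.1, 0.1, 1.0],
    [1.0, 1.0, 0.1],
]
old_mass = [2 / 3, 1 / 3]      # assumed: mass proportional to block size
new_mass = [1 / 3, 1 / 3, 1 / 3]

plan = sinkhorn_plan(cost, old_mass, new_mass)
for row in plan:
    print([round(p, 3) for p in row])
```

Each row of `plan` spreads the mass of one old block over the new blocks, so a split block shows up as one row with two large entries rather than being forced into a one-to-one match.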

Keywords: document comparison, graph matching, graph neural network, modification similarity, multi-modal

Procedia PDF Downloads 21
6676 Using New Machine Algorithms to Classify Iranian Musical Instruments According to Temporal, Spectral and Coefficient Features

Authors: Ronak Khosravi, Mahmood Abbasi Layegh, Siamak Haghipour, Avin Esmaili

Abstract:

In this paper, we study the classification of musical woodwind instruments using a small set of features selected from a broad range of extracted ones by the sequential forward selection method. First, we extract 42 features for each record in a music database of 402 sound files belonging to five groups: flutes (end-blown and internal duct), single-reed, double-reed (exposed and capped), triple-reed, and quadruple-reed. Then, the sequential forward selection method is used to choose the best feature set in order to achieve very high classification accuracy. Two classification techniques, support vector machines and relevance vector machines, were tested, and an accuracy of up to 96% was achieved using 21 time, frequency, and coefficient features and a relevance vector machine with the Gaussian kernel function.
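The sequential forward selection step described above can be sketched as a greedy loop that repeatedly adds the single feature giving the best validation accuracy. This is only an illustration under stated assumptions: a nearest-centroid classifier stands in for the support vector and relevance vector machines used in the paper, the data are synthetic, and the names `nearest_centroid_accuracy`, `sequential_forward_selection`, and `make_toy_data` are hypothetical.

```python
import random

def nearest_centroid_accuracy(train, valid, features):
    """Validation accuracy of a nearest-centroid classifier restricted to `features`."""
    sums, counts = {}, {}
    for x, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, f in enumerate(features):
            acc[i] += x[f]
        counts[label] = counts.get(label, 0) + 1
    centroids = {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

    def predict(x):
        return min(
            centroids,
            key=lambda lab: sum((x[f] - c) ** 2 for f, c in zip(features, centroids[lab])),
        )

    return sum(predict(x) == label for x, label in valid) / len(valid)

def sequential_forward_selection(train, valid, n_features, k):
    """Greedily grow a feature subset of size k, one best feature at a time."""
    selected = []
    remaining = list(range(n_features))
    while len(selected) < k:
        best = max(
            remaining,
            key=lambda f: nearest_centroid_accuracy(train, valid, selected + [f]),
        )
        selected.append(best)
        remaining.remove(best)
    return selected

def make_toy_data(n_per_class, seed):
    """Synthetic 6-feature data: only features 0 and 3 carry class information."""
    rng = random.Random(seed)
    means = {"A": {0: -3.0}, "B": {3: 3.0}, "C": {0: 3.0}}
    data = []
    for label, shifts in means.items():
        for _ in range(n_per_class):
            x = [rng.gauss(shifts.get(f, 0.0), 1.0) for f in range(6)]
            data.append((x, label))
    return data

train = make_toy_data(300, seed=0)
valid = make_toy_data(300, seed=1)
selected = sequential_forward_selection(train, valid, n_features=6, k=2)
print(selected)
```

In the setting of the abstract, the same loop would run over the 42 extracted features and stop at the subset size that gives the best accuracy; scoring on a held-out split, as here, is one of several reasonable choices.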

Keywords: coefficient features, relevance vector machines, spectral features, support vector machines, temporal features

Procedia PDF Downloads 189
6675 Research on Perceptual Features of Couchsurfers on New Hospitality Tourism Platform Couchsurfing

Authors: Yuanxiang Miao

Abstract:

This paper examines the perceptual features of couchsurfers on a new hospitality tourism platform, the free homestay website Couchsurfing. As a local host, the author has hosted 61 couchsurfers in Kyoto, Japan, and attempted to identify their perceptual characteristics through hosting them. The methodology is based mainly on in-depth interviews, complemented by conversations with couchsurfers, observation of their behaviors, and questionnaires. Five dominant perceptual features of couchsurfers were identified: (1) Trusting; (2) Meeting; (3) Sharing; (4) Reciprocity; (5) Worries. The value of this research lies in a deeper understanding of the perceptual features of couchsurfers, grounded in the author's experience of hosting and staying with 61 couchsurfers from 30 countries and areas over one year. Lastly, the author offers practical suggestions for future research.

Keywords: couchsurfing, depth interview, hospitality tourism, perceptual features

Procedia PDF Downloads 40
6674 Unsupervised Neural Architecture for Saliency Detection

Authors: Natalia Efremova, Sergey Tarasenko

Abstract:

We propose a novel neural network architecture for visual saliency detection, which uses neurophysiologically plausible mechanisms to extract salient regions. The model is strongly inspired by recent findings from neurophysiology and aims to simulate the bottom-up processes of human selective attention. Two types of features were analyzed: color and direction of maximum variance. The mechanism we employ for processing those features is PCA, implemented by means of normalized Hebbian learning and waves of spikes. To evaluate the performance of our model, we conducted a psychological experiment. Comparison of the simulation results with those of the experiment indicates good performance of our model.
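To make the link between normalized Hebbian learning and PCA concrete, the sketch below applies Oja's rule (listed in the keywords) to a single linear neuron and recovers the direction of maximum variance of some synthetic 2D data. It is a minimal illustration, not the authors' architecture: the spiking mechanism is omitted, and the learning rate, epoch count, data, and function names are assumptions.

```python
import math
import random

def oja_first_component(samples, lr=0.002, epochs=30, seed=0):
    """Estimate the first principal direction of `samples` with Oja's rule."""
    dim = len(samples[0])
    # Centre the data, since PCA is defined on zero-mean inputs
    mean = [sum(x[d] for x in samples) / len(samples) for d in range(dim)]
    centred = [[x[d] - mean[d] for d in range(dim)] for x in samples]

    rng = random.Random(seed)
    w = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]

    for _ in range(epochs):
        for x in centred:
            y = sum(wi * xi for wi, xi in zip(w, x))  # neuron output
            # Hebbian term lr*y*x plus the decay term -lr*y^2*w that keeps |w| near 1
            w = [wi + lr * y * (xi - y * wi) for wi, xi in zip(w, x)]
    return w

def make_data(n=500, seed=1):
    """Synthetic data with most variance along the (1, 1) diagonal."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        a = rng.gauss(0.0, 2.0)   # large spread along (1, 1) / sqrt(2)
        b = rng.gauss(0.0, 0.3)   # small spread along (1, -1) / sqrt(2)
        data.append(((a + b) / math.sqrt(2), (a - b) / math.sqrt(2)))
    return data

samples = make_data()
w = oja_first_component(samples)
print([round(v, 3) for v in w])
```

The learned weight vector should be close to the unit diagonal up to sign, since a principal direction is only defined up to a flip.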

Keywords: neural network models, visual saliency detection, normalized Hebbian learning, Oja's rule, psychological experiment

Procedia PDF Downloads 274
6673 The Latent Model of Linguistic Features in Korean College Students’ L2 Argumentative Writings: Syntactic Complexity, Lexical Complexity, and Fluency

Authors: Jiyoung Bae, Gyoomi Kim

Abstract:

This study explores a range of linguistic features used in Korean college students’ argumentative writings for the purpose of developing a model that identifies variables which predict writing proficiency. The study investigated the latent variable structure of L2 linguistic features, including syntactic complexity, lexical complexity, and fluency. One hundred forty-six university students in Korea participated. The results of the confirmatory factor analysis (CFA) showed that the indicators of linguistic features from this study provided a foundation for re-categorizing indicators found in extant research on L2 Korean writers according to each latent variable of linguistic features. The CFA models indicated one measurement model of L2 syntactic complexity and L2 learners’ writing proficiency; these two latent factors were correlated with each other. Based on the overall findings, the integrated linguistic features of L2 writing suggest some pedagogical implications for L2 writing instruction.

Keywords: linguistic features, syntactic complexity, lexical complexity, fluency

Procedia PDF Downloads 57
6672 Hybrid Anomaly Detection Using Decision Tree and Support Vector Machine

Authors: Elham Serkani, Hossein Gharaee Garakani, Naser Mohammadzadeh, Elaheh Vaezpour

Abstract:

Intrusion detection systems (IDS) are main components of network security. These systems analyze network events to detect intrusions. An IDS is designed by training on normal traffic data or attack data. Machine learning methods are among the best ways to design IDSs. In the method presented in this article, the pruning algorithm of the C5.0 decision tree is used to reduce the features of the traffic data, and the IDS is trained by the least squares support vector machine (LS-SVM) algorithm. The remaining features are then ranked according to the predictor importance criterion, and the least important features are eliminated in order. The features remaining at this stage that yield the highest accuracy in LS-SVM are selected as the final features. Compared with similar articles that have examined selected features in the least squares support vector machine model, the features obtained are better in accuracy, true positive rate, and false positive rate. The results are tested on the UNSW-NB15 dataset.

Keywords: decision tree, feature selection, intrusion detection system, support vector machine

Procedia PDF Downloads 153
6671 Preprocessing and Fusion of Multiple Representation of Finger Vein Patterns Using Conventional and Machine Learning Techniques

Authors: Tomas Trainys, Algimantas Venckauskas

Abstract:

The application of biometric features to cryptography for human identification and authentication is a widely studied and promising area in the development of high-reliability cryptosystems. Biometric cryptosystems are typically designed for pattern recognition: they acquire biometric data from an individual, extract feature sets, compare the feature set against the set stored in the vault, and return the result of the comparison. Preprocessing and fusion of biometric data are the most important phases in generating a feature vector for key generation or authentication. Fusion of biometric features is critical for achieving a higher level of security and preventing possible spoofing attacks. The paper focuses on the tasks of initial processing and fusion of multiple representations of finger vein modality patterns. These tasks are solved by applying conventional image preprocessing methods and machine learning techniques, using the Convolutional Neural Network (SVM) method for image segmentation and feature extraction. The article presents a method for generating sets of biometric features from a finger vein network using several instances of the same modality. The extracted feature sets were fused at the feature level. The proposed method was tested and compared with the performance and accuracy results of other authors.
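Feature-level fusion, as mentioned in the abstract, combines the feature vectors of several instances into one vector before any matching or key generation takes place. The abstract does not say how the fusion is done; the sketch below assumes one common form, L2-normalising each vector and concatenating the results, and the function names and numbers are hypothetical.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length; all-zero vectors are returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

def fuse_feature_level(feature_sets):
    """Concatenate the L2-normalised feature vectors of several instances."""
    fused = []
    for vec in feature_sets:
        # Normalising first stops one instance from dominating by scale alone
        fused.extend(l2_normalize(vec))
    return fused

# Made-up feature vectors from two captures of the same finger
instance_a = [3.0, 4.0]
instance_b = [0.0, 5.0, 12.0]

fused = fuse_feature_level([instance_a, instance_b])
print([round(v, 4) for v in fused])
```

The fused vector has one entry per input feature, so its length grows with the number of instances; this is the usual trade-off of feature-level fusion compared with fusing scores or decisions.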

Keywords: bio-cryptography, biometrics, cryptographic key generation, data fusion, information security, SVM, pattern recognition, finger vein method

Procedia PDF Downloads 53