Search results for: word pair
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 384

Search results for: word pair

54 Using Suffix Tree Document Representation in Hierarchical Agglomerative Clustering

Authors: Daniel I. Morariu, Radu G. Cretulescu, Lucian N. Vintan

Abstract:

In text categorization problem the most used method for documents representation is based on words frequency vectors called VSM (Vector Space Model). This representation is based only on words from documents and in this case loses any “word context" information found in the document. In this article we make a comparison between the classical method of document representation and a method called Suffix Tree Document Model (STDM) that is based on representing documents in the Suffix Tree format. For the STDM model we proposed a new approach for documents representation and a new formula for computing the similarity between two documents. Thus we propose to build the suffix tree only for any two documents at a time. This approach is faster, it has lower memory consumption and use entire document representation without using methods for disposing nodes. Also for this method is proposed a formula for computing the similarity between documents, which improves substantially the clustering quality. This representation method was validated using HAC - Hierarchical Agglomerative Clustering. In this context we experiment also the stemming influence in the document preprocessing step and highlight the difference between similarity or dissimilarity measures to find “closer" documents.

Keywords: Text Clustering, Suffix tree documentrepresentation, Hierarchical Agglomerative Clustering

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1910
53 Displacement Solution for a Static Vertical Rigid Movement of an Interior Circular Disc in a Transversely Isotropic Tri-Material Full-Space

Authors: D. Mehdizadeh, M. Rahimian, M. Eskandari-Ghadi

Abstract:

This article is concerned with the determination of the static interaction of a vertically loaded rigid circular disc embedded at the interface of a horizontal layer sandwiched in between two different transversely isotropic half-spaces called as tri-material full-space. The axes of symmetry of different regions are assumed to be normal to the horizontal interfaces and parallel to the movement direction. With the use of a potential function method, and by implementing Hankel integral transforms in the radial direction, the government partial differential equation for the solely scalar potential function is transformed to an ordinary 4th order differential equation, and the mixed boundary conditions are transformed into a pair of integral equations called dual integral equations, which can be reduced to a Fredholm integral equation of the second kind, which is solved analytically. Then, the displacements and stresses are given in the form of improper line integrals, which is due to inverse Hankel integral transforms. It is shown that the present solutions are in exact agreement with the existing solutions for a homogeneous full-space with transversely isotropic material. To confirm the accuracy of the numerical evaluation of the integrals involved, the numerical results are compared with the solutions exists for the homogeneous full-space. Then, some different cases with different degrees of material anisotropy are compared to portray the effect of degree of anisotropy.

 

Keywords: Transversely isotropic, rigid disc, elasticity, dual integral equations, tri-material full-space.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1675
52 Identification of Most Frequently Occurring Lexis in Winnings-announcing Unsolicited Bulke-mails

Authors: Jatinderkumar R. Saini, Apurva A. Desai

Abstract:

e-mail has become an important means of electronic communication but the viability of its usage is marred by Unsolicited Bulk e-mail (UBE) messages. UBE consists of many types like pornographic, virus infected and 'cry-for-help' messages as well as fake and fraudulent offers for jobs, winnings and medicines. UBE poses technical and socio-economic challenges to usage of e-mails. To meet this challenge and combat this menace, we need to understand UBE. Towards this end, the current paper presents a content-based textual analysis of nearly 3000 winnings-announcing UBE. Technically, this is an application of Text Parsing and Tokenization for an un-structured textual document and we approach it using Bag Of Words (BOW) and Vector Space Document Model techniques. We have attempted to identify the most frequently occurring lexis in the winnings-announcing UBE documents. The analysis of such top 100 lexis is also presented. We exhibit the relationship between occurrence of a word from the identified lexisset in the given UBE and the probability that the given UBE will be the one announcing fake winnings. To the best of our knowledge and survey of related literature, this is the first formal attempt for identification of most frequently occurring lexis in winningsannouncing UBE by its textual analysis. Finally, this is a sincere attempt to bring about alertness against and mitigate the threat of such luring but fake UBE.

Keywords: Lexis, Unsolicited Bulk e-mail (UBE), Vector SpaceDocument Model, Winnings, Lottery

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1539
51 Reduced Dynamic Time Warping for Handwriting Recognition Based on Multidimensional Time Series of a Novel Pen Device

Authors: Muzaffar Bashir, Jürgen Kempf

Abstract:

The purpose of this paper is to present a Dynamic Time Warping technique which reduces significantly the data processing time and memory size of multi-dimensional time series sampled by the biometric smart pen device BiSP. The acquisition device is a novel ballpoint pen equipped with a diversity of sensors for monitoring the kinematics and dynamics of handwriting movement. The DTW algorithm has been applied for time series analysis of five different sensor channels providing pressure, acceleration and tilt data of the pen generated during handwriting on a paper pad. But the standard DTW has processing time and memory space problems which limit its practical use for online handwriting recognition. To face with this problem the DTW has been applied to the sum of the five sensor signals after an adequate down-sampling of the data. Preliminary results have shown that processing time and memory size could significantly be reduced without deterioration of performance in single character and word recognition. Further excellent accuracy in recognition was achieved which is mainly due to the reduced dynamic time warping RDTW technique and a novel pen device BiSP.

Keywords: Biometric character recognition, biometric person authentication, biometric smart pen BiSP, dynamic time warping DTW, online-handwriting recognition, multidimensional time series.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2406
50 Testing Loaded Programs Using Fault Injection Technique

Authors: S. Manaseer, F. A. Masooud, A. A. Sharieh

Abstract:

Fault tolerance is critical in many of today's large computer systems. This paper focuses on improving fault tolerance through testing. Moreover, it concentrates on the memory faults: how to access the editable part of a process memory space and how this part is affected. A special Software Fault Injection Technique (SFIT) is proposed for this purpose. This is done by sequentially scanning the memory of the target process, and trying to edit maximum number of bytes inside that memory. The technique was implemented and tested on a group of programs in software packages such as jet-audio, Notepad, Microsoft Word, Microsoft Excel, and Microsoft Outlook. The results from the test sample process indicate that the size of the scanned area depends on several factors. These factors are: process size, process type, and virtual memory size of the machine under test. The results show that increasing the process size will increase the scanned memory space. They also show that input-output processes have more scanned area size than other processes. Increasing the virtual memory size will also affect the size of the scanned area but to a certain limit.

Keywords: Complex software systems, Error detection, Fault tolerance, Injection and testing methodology, Memory faults, Process and virtual memory.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1886
49 Comparative Correlation Investigation of Polynuclear Aromatic Hydrocarbons (PAHs) in Soils of Different Land Use: Sources Evaluation Perspective

Authors: O. Onoriode Emoyan, E. Eyitemi Akporhonor, Charles Otobrise

Abstract:

Polycyclic Aromatic Hydrocarbons (PAHs) are formed mainly because of incomplete combustion of organic materials during industrial, domestic activities or natural occurrence. Their toxicity and contamination of terrestrial and aquatic ecosystem have been established. However, with limited validity index, previous research has focused on PAHs isomer pair ratios of variable physicochemical properties in source identification. The objective of this investigation was to determine the empirical validity of Pearson Correlation Coefficient (PCC) and Cluster Analysis (CA) in PAHs source identification along soil samples of different land uses. Therefore, 16 PAHs grouped, as Endocrine Disruption Substances (EDSs) were determined in 10 sample stations in top and sub soils seasonally. PAHs was determined the use of Varian 300 gas chromatograph interfaced with flame ionization detector. Instruments and reagents used are of standard and chromatographic grades respectively. PCC and CA results showed that the classification of PAHs along pyrolitic and petrogenic organics used in source signature is about the predominance PAHs in environmental matrix. Therefore, the distribution of PAHs in the studied stations revealed the presence of trace quantities of the vast majority of the sixteen PAHs, which may ultimately inhabit the actual source signature authentication. Therefore, factors to be considered when evaluating possible sources of PAHs could be; type and extent of bacterial metabolism, transformation products/substrates, and environmental factors such as salinity, pH, oxygen concentration, nutrients, light intensity, temperature, co-substrates, and environmental medium are hereby recommended as factors to be considered when evaluating possible sources of PAHs.

Keywords: Comparative correlation, kinetically, polynuclear aromatic hydrocarbons, thermodynamically- favored PAHs, sources evaluation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1984
48 Learning to Order Terms: Supervised Interestingness Measures in Terminology Extraction

Authors: Jérôme Azé, Mathieu Roche, Yves Kodratoff, Michèle Sebag

Abstract:

Term Extraction, a key data preparation step in Text Mining, extracts the terms, i.e. relevant collocation of words, attached to specific concepts (e.g. genetic-algorithms and decisiontrees are terms associated to the concept “Machine Learning" ). In this paper, the task of extracting interesting collocations is achieved through a supervised learning algorithm, exploiting a few collocations manually labelled as interesting/not interesting. From these examples, the ROGER algorithm learns a numerical function, inducing some ranking on the collocations. This ranking is optimized using genetic algorithms, maximizing the trade-off between the false positive and true positive rates (Area Under the ROC curve). This approach uses a particular representation for the word collocations, namely the vector of values corresponding to the standard statistical interestingness measures attached to this collocation. As this representation is general (over corpora and natural languages), generality tests were performed by experimenting the ranking function learned from an English corpus in Biology, onto a French corpus of Curriculum Vitae, and vice versa, showing a good robustness of the approaches compared to the state-of-the-art Support Vector Machine (SVM).

Keywords: Text-mining, Terminology Extraction, Evolutionary algorithm, ROC Curve.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1659
47 Optimization of R507A-R23 Cascade Refrigeration System using Genetic Algorithm

Authors: A. D. Parekh, P. R. Tailor, H.R Jivanramajiwala

Abstract:

The present work deals with optimization of cascade refrigeration system using eco friendly refrigerants pair R507A and R23. R507A is azeotropic mixture composed of HFC refrigerants R125/R143a (50%/50% by wt.). R23 is a single component HFC refrigerant used as replacement to CFC refrigerant R13 in low temperature applications. These refrigerants have zero ozone depletion potential and are non-flammable. Optimization of R507AR23 cascade refrigeration system performance parameters such as minimum work required, refrigeration effect, coefficient of performance and exergetic efficiency was carried out in terms of eight operating parameters- combinations using Genetic Algorithm tool. The eight operating parameters include (1) low side evaporator temperature (2) high side condenser temperature (3) temperature difference in the cascade heat exchanger (4) low side condenser temperature (5) low side degree of subcooling (6) high side degree of subcooling (7) low side degree of superheating (8) high side degree of superheating. Results show that for minimum work system should operate at high temperature in low side evaporator, low temperature in high side condenser, low temperature difference in cascade condenser, high temperature in low side condenser and low degree of subcooling and superheating in both side. For maximum refrigeration effect system should operate at high temperature in low side evaporator, high temperature in high side condenser, high temperature difference in cascade condenser, low temperature in low side condenser and higher degree of subcooling in LT and HT side. For maximum coefficient of performance and exergetic efficiency, system should operate at high temperature in low side evaporator, low temperature in high side condenser, low temperature difference in cascade condenser, high temperature in low side condenser and higher degree of subcooling and superheating in low side of the system.

Keywords: Cascade refrigeration system, Genetic Algorithm, R507A, R23,

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2134
46 Evaluation of the Analytic for Hemodynamic Instability as A Prediction Tool for Early Identification of Patient Deterioration

Authors: Bryce Benson, Sooin Lee, Ashwin Belle

Abstract:

Unrecognized or delayed identification of patient deterioration is a key cause of in-hospitals adverse events. Clinicians rely on vital signs monitoring to recognize patient deterioration. However, due to ever increasing nursing workloads and the manual effort required, vital signs tend to be measured and recorded intermittently, and inconsistently causing large gaps during patient monitoring. Additionally, during deterioration, the body’s autonomic nervous system activates compensatory mechanisms causing the vital signs to be lagging indicators of underlying hemodynamic decline. This study analyzes the predictive efficacy of the Analytic for Hemodynamic Instability (AHI) system, an automated tool that was designed to help clinicians in early identification of deteriorating patients. The lead time analysis in this retrospective observational study assesses how far in advance AHI predicted deterioration prior to the start of an episode of hemodynamic instability (HI) becoming evident through vital signs? Results indicate that of the 362 episodes of HI in this study, 308 episodes (85%) were correctly predicted by the AHI system with a median lead time of 57 minutes and an average of 4 hours (240.5 minutes). Of the 54 episodes not predicted, AHI detected 45 of them while the episode of HI was ongoing. Of the 9 undetected, 5 were not detected by AHI due to either missing or noisy input ECG data during the episode of HI. In total, AHI was able to either predict or detect 98.9% of all episodes of HI in this study. These results suggest that AHI could provide an additional ‘pair of eyes’ on patients, continuously filling the monitoring gaps and consequently giving the patient care team the ability to be far more proactive in patient monitoring and adverse event management.

Keywords: Clinical deterioration prediction, decision support system, early warning system, hemodynamic status, physiologic monitoring.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 450
45 Inferring Hierarchical Pronunciation Rules from a Phonetic Dictionary

Authors: Erika Pigliapoco, Valerio Freschi, Alessandro Bogliolo

Abstract:

This work presents a new phonetic transcription system based on a tree of hierarchical pronunciation rules expressed as context-specific grapheme-phoneme correspondences. The tree is automatically inferred from a phonetic dictionary by incrementally analyzing deeper context levels, eventually representing a minimum set of exhaustive rules that pronounce without errors all the words in the training dictionary and that can be applied to out-of-vocabulary words. The proposed approach improves upon existing rule-tree-based techniques in that it makes use of graphemes, rather than letters, as elementary orthographic units. A new linear algorithm for the segmentation of a word in graphemes is introduced to enable outof- vocabulary grapheme-based phonetic transcription. Exhaustive rule trees provide a canonical representation of the pronunciation rules of a language that can be used not only to pronounce out-of-vocabulary words, but also to analyze and compare the pronunciation rules inferred from different dictionaries. The proposed approach has been implemented in C and tested on Oxford British English and Basic English. Experimental results show that grapheme-based rule trees represent phonetically sound rules and provide better performance than letter-based rule trees.

Keywords: Automatic phonetic transcription, pronunciation rules, hierarchical tree inference.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1925
44 Preliminary Study of the Phonological Development in Three- and Four-Year-Old Bulgarian Children

Authors: Tsvetomira Braynova, Miglena Simonska

Abstract:

The article presents the results of a research of phonological processes in three- and four-year-old children. A test, created for the purpose of the study, was developed and conducted among 120 children. The study included three areas of research - at the level of words (96 words), at the level of sentence repetition (10 sentences) and at the level of generating own speech from a picture (15 pictures). The test also gives us additional information about the articulation errors of the assessed children. The main purpose of the research is to analyze all phonological processes that occur at this age in Bulgarian children and to identify which are typical and atypical for this age. The results show that the most common phonology errors that children make are: sound substitution, elision of sound, metathesis of sound, elision of syllable, elision of consonants clustered in a syllable. Measuring the correlation between average length of repeated speech and average length of generated speech, the analysis does not prove that the more words a child can repeat in part “repeated speech”, the more words they can be expected to generate in part “generating sentence”. The results of this study show that the task of naming a word provides sufficient and representative information to assess the child's phonology.

Keywords: Articulation, phonology, speech, language development.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 384
43 Continuous Feature Adaptation for Non-Native Speech Recognition

Authors: Y. Deng, X. Li, C. Kwan, B. Raj, R. Stern

Abstract:

The current speech interfaces in many military applications may be adequate for native speakers. However, the recognition rate drops quite a lot for non-native speakers (people with foreign accents). This is mainly because the nonnative speakers have large temporal and intra-phoneme variations when they pronounce the same words. This problem is also complicated by the presence of large environmental noise such as tank noise, helicopter noise, etc. In this paper, we proposed a novel continuous acoustic feature adaptation algorithm for on-line accent and environmental adaptation. Implemented by incremental singular value decomposition (SVD), the algorithm captures local acoustic variation and runs in real-time. This feature-based adaptation method is then integrated with conventional model-based maximum likelihood linear regression (MLLR) algorithm. Extensive experiments have been performed on the NATO non-native speech corpus with baseline acoustic model trained on native American English. The proposed feature-based adaptation algorithm improved the average recognition accuracy by 15%, while the MLLR model based adaptation achieved 11% improvement. The corresponding word error rate (WER) reduction was 25.8% and 2.73%, as compared to that without adaptation. The combined adaptation achieved overall recognition accuracy improvement of 29.5%, and WER reduction of 31.8%, as compared to that without adaptation.

Keywords: speaker adaptation; environment adaptation; robust speech recognition; SVD; non-native speech recognition

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3217
42 Principle Components Updates via Matrix Perturbations

Authors: Aiman Elragig, Hanan Dreiwi, Dung Ly, Idriss Elmabrook

Abstract:

This paper highlights a new approach to look at online principle components analysis (OPCA). Given a data matrix X R,^m x n we characterise the online updates of its covariance as a matrix perturbation problem. Up to the principle components, it turns out that online updates of the batch PCA can be captured by symmetric matrix perturbation of the batch covariance matrix. We have shown that as n→ n0 >> 1, the batch covariance and its update become almost similar. Finally, utilize our new setup of online updates to find a bound on the angle distance of the principle components of X and its update.

Keywords: Online data updates, covariance matrix, online principle component analysis (OPCA), matrix perturbation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1038
41 Investigating Iraqi EFL University Students' Productive Knowledge of Grammatical Collocations in English

Authors: Adnan Z. Mkhelif

Abstract:

Grammatical collocations (GCs) are word combinations containing a preposition or a grammatical structure, such as an infinitive (e.g. smile at, interested in, easy to learn, etc.). Such collocations tend to be difficult for Iraqi EFL university students (IUS) to master. To help address this problem, it is important to identify the factors causing it. This study aims at investigating the effects of L2 proficiency, frequency of GCs and their transparency on IUSs’ productive knowledge of GCs. The study involves 112 undergraduate participants with different proficiency levels, learning English in formal contexts in Iraq. The data collection instruments include (but not limited to) a productive knowledge test (designed by the researcher using the British National Corpus (BNC)), as well as the grammar part of the Oxford Placement Test (OPT). The study findings have shown that all the above-mentioned factors have significant effects on IUSs’ productive knowledge of GCs. In addition to establishing evidence of which factors of L2 learning might be relevant to learning GCs, it is hoped that the findings of the present study will contribute to more effective methods of teaching that can better address and help overcome the problems IUSs encounter in learning GCs. The study is thus hoped to have significant theoretical and pedagogical implications for researchers, syllabus designers as well as teachers of English as a foreign/second language.

Keywords: Corpus linguistics, frequency, grammatical collocations, L2 vocabulary learning, productive knowledge, proficiency, transparency.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 868
40 Identification of Most Frequently Occurring Lexis in Body-enhancement Medicinal Unsolicited Bulk e-mails

Authors: Jatinderkumar R. Saini, Apurva A. Desai

Abstract:

e-mail has become an important means of electronic communication but the viability of its usage is marred by Unsolicited Bulk e-mail (UBE) messages. UBE consists of many types like pornographic, virus infected and 'cry-for-help' messages as well as fake and fraudulent offers for jobs, winnings and medicines. UBE poses technical and socio-economic challenges to usage of e-mails. To meet this challenge and combat this menace, we need to understand UBE. Towards this end, the current paper presents a content-based textual analysis of more than 2700 body enhancement medicinal UBE. Technically, this is an application of Text Parsing and Tokenization for an un-structured textual document and we approach it using Bag Of Words (BOW) and Vector Space Document Model techniques. We have attempted to identify the most frequently occurring lexis in the UBE documents that advertise various products for body enhancement. The analysis of such top 100 lexis is also presented. We exhibit the relationship between occurrence of a word from the identified lexis-set in the given UBE and the probability that the given UBE will be the one advertising for fake medicinal product. To the best of our knowledge and survey of related literature, this is the first formal attempt for identification of most frequently occurring lexis in such UBE by its textual analysis. Finally, this is a sincere attempt to bring about alertness against and mitigate the threat of such luring but fake UBE.

Keywords: Body Enhancement, Lexis, Medicinal, Unsolicited Bulk e-mail (UBE), Vector Space Document Model, Viagra

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3508
39 A Two-Stage Adaptation towards Automatic Speech Recognition System for Malay-Speaking Children

Authors: Mumtaz Begum Mustafa, Siti Salwah Salim, Feizal Dani Rahman

Abstract:

Recently, Automatic Speech Recognition (ASR) systems were used to assist children in language acquisition as it has the ability to detect human speech signal. Despite the benefits offered by the ASR system, there is a lack of ASR systems for Malay-speaking children. One of the contributing factors for this is the lack of continuous speech database for the target users. Though cross-lingual adaptation is a common solution for developing ASR systems for under-resourced language, it is not viable for children as there are very limited speech databases as a source model. In this research, we propose a two-stage adaptation for the development of ASR system for Malay-speaking children using a very limited database. The two stage adaptation comprises the cross-lingual adaptation (first stage) and cross-age adaptation. For the first stage, a well-known speech database that is phonetically rich and balanced, is adapted to the medium-sized Malay adults using supervised MLLR. The second stage adaptation uses the speech acoustic model generated from the first adaptation, and the target database is a small-sized database of the target users. We have measured the performance of the proposed technique using word error rate, and then compare them with the conventional benchmark adaptation. The two stage adaptation proposed in this research has better recognition accuracy as compared to the benchmark adaptation in recognizing children’s speech.

Keywords: Automatic speech recognition system, children speech, adaptation, Malay.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1752
38 Individual Differences and Paired Learning in Virtual Environments

Authors: Patricia M. Boechler, Heather M. Gautreau

Abstract:

In this research study, postsecondary students completed an information learning task in an avatar-based 3D virtual learning environment. Three factors were of interest in relation to learning; 1) the influence of collaborative vs. independent conditions, 2) the influence of the spatial arrangement of the virtual environment (linear, random and clustered), and 3) the relationship of individual differences such as spatial skill, general computer experience and video game experience to learning. Students completed pretest measures of prior computer experience and prior spatial skill. Following the premeasure administration, students were given instruction to move through the virtual environment and study all the material within 10 information stations. In the collaborative condition, students proceeded in randomly assigned pairs, while in the independent condition they proceeded alone. After this learning phase, all students individually completed a multiple choice test to determine information retention. The overall results indicated that students in pairs did not perform any better or worse than independent students. As far as individual differences, only spatial ability predicted the performance of students. General computer experience and video game experience did not. Taking a closer look at the pairs and spatial ability, comparisons were made on pairs high/matched spatial ability, pairs low/matched spatial ability and pairs that were mismatched on spatial ability. The results showed that both high/matched pairs and mismatched pairs outperformed low/matched pairs. That is, if a pair had even one individual with strong spatial ability they would perform better than pairs with only low spatial ability individuals. This suggests that, in virtual environments, the specific individuals that are paired together are important for performance outcomes. The paper also includes a discussion of trends within the data that have implications for virtual environment education.

Keywords: Avatar-based, virtual environment, paired learning, individual differences.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 779
37 Variational Explanation Generator: Generating Explanation for Natural Language Inference Using Variational Auto-Encoder

Authors: Zhen Cheng, Xinyu Dai, Shujian Huang, Jiajun Chen

Abstract:

Recently, explanatory natural language inference has attracted much attention for the interpretability of logic relationship prediction, which is also known as explanation generation for Natural Language Inference (NLI). Existing explanation generators based on discriminative Encoder-Decoder architecture have achieved noticeable results. However, we find that these discriminative generators usually generate explanations with correct evidence but incorrect logic semantic. It is due to that logic information is implicitly encoded in the premise-hypothesis pairs and difficult to model. Actually, logic information identically exists between premise-hypothesis pair and explanation. And it is easy to extract logic information that is explicitly contained in the target explanation. Hence we assume that there exists a latent space of logic information while generating explanations. Specifically, we propose a generative model called Variational Explanation Generator (VariationalEG) with a latent variable to model this space. Training with the guide of explicit logic information in target explanations, latent variable in VariationalEG could capture the implicit logic information in premise-hypothesis pairs effectively. Additionally, to tackle the problem of posterior collapse while training VariaztionalEG, we propose a simple yet effective approach called Logic Supervision on the latent variable to force it to encode logic information. Experiments on explanation generation benchmark—explanation-Stanford Natural Language Inference (e-SNLI) demonstrate that the proposed VariationalEG achieves significant improvement compared to previous studies and yields a state-of-the-art result. Furthermore, we perform the analysis of generated explanations to demonstrate the effect of the latent variable.

Keywords: Natural Language Inference, explanation generation, variational auto-encoder, generative model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 692
36 Molecular and Serological Diagnosis of Newcastle and Ornithobacterium rhinotracheale Broiler in Chicken in Fars Province, Iran

Authors: Mohammadjavad Mehrabanpour, Maryam Ranjbar Bushehri, Dorsa Mehrabanpour

Abstract:

Respiratory diseases are the most important problems in the country’s poultry industry, particularly when it comes to broiler flocks. Ornithobacterium rhinotracheale (ORT) is a species that causes poor performance in growth rate, egg production, and mortality. This pathogen causes a respiratory infection including pulmonary alveolar inflammation, and pneumonia of birds throughout the world. Newcastle disease (ND) is a highly contagious disease in poultry, and also, it causes considerable losses to the poultry industry. The aim of this study was to evaluate the simultaneous occurrence of ORT and ND and NDV isolation by inoculation in embryonated eggs and confirmed by RT-PCR in broiler chicken flocks in Fars province. In this study, 318 blood and 85 tissue samples (brain, trachea, liver, and cecal tonsils) were collected from 15 broiler chicken farms. Survey serum antibody titers against ORT by using a commercial enzyme-linked immunosorbent assay (ELISA) kit performed. Evaluation of antibody titer against ND virus is performed by hemagglutination inhibition test. Virus isolation with chick embryo eggs 9-11 and RT-PCR method were carried out. A total of 318 serum samples, 135 samples (42.5%) were positive for antibodies to ORT and titer of HI antibodies against NDV in 122 serum samples (38/4%) were 7-10 (log2) and 61 serum samples (19/2%) had occurrence antibody titer against Newcastle virus and ORT. Results of the present study indicated that 20 tissue samples were positive in embryonated egg and in rapid hemagglutination (HA) test. HI test with specific ND positive serum confirmed that 6 of 20 samples. PCR confirmed that all six samples were positive and PCR products of samples indicated 535-base pair fragments in electrophrosis. Due to the great economic importance of these two diseases in the poultry industry, it is necessary to design and implement a comprehensive plan for prevention and control of these diseases.

Keywords: ELISA, Newcastle disease, Ornithobacterium rhinotracheale, seroprevalence.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1410
35 Portfolio Management for Construction Company during Covid-19 Using AHP Technique

Authors: Sareh Rajabi, Salwa Bheiry

Abstract:

In general, Covid-19 created many financial and non-financial damages to the economy and community. Level and severity of covid-19 as pandemic case varies over the region and due to different types of the projects. Covid-19 virus emerged as one of the most imperative risk management factors word-wide recently. Therefore, as part of portfolio management assessment, it is essential to evaluate severity of such risk on the project and program in portfolio management level to avoid any risky portfolio. Covid-19 appeared very effectively in South America, part of Europe and Middle East. Such pandemic infection affected the whole universe, due to lock down, interruption in supply chain management, health and safety requirements, transportations and commercial impacts. Therefore, this research proposes Analytical Hierarchy Process (AHP) to analyze and assess such pandemic case like Covid-19 and its impacts on the construction projects. The AHP technique uses four sub-criteria: Health and safety, commercial risk, completion risk and contractual risk to evaluate the project and program. The result will provide the decision makers with information which project has higher or lower risk in case of Covid-19 and pandemic scenario. Therefore, the decision makers can have most feasible solution based on effective weighted criteria for project selection within their portfolio to match with the organization’s strategies.

Keywords: Portfolio management, risk management, COVID-19, analytical hierarchy process technique.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 832
34 Information Retrieval: A Comparative Study of Textual Indexing Using an Oriented Object Database (db4o) and the Inverted File

Authors: Mohammed Erritali

Abstract:

The growth in the volume of text data such as books and articles in libraries for centuries has imposed to establish effective mechanisms to locate them. Early techniques such as abstraction, indexing and the use of classification categories have marked the birth of a new field of research called "Information Retrieval". Information Retrieval (IR) can be defined as the task of defining models and systems whose purpose is to facilitate access to a set of documents in electronic form (corpus) to allow a user to find the relevant ones for him, that is to say, the contents which matches with the information needs of the user. Most of the models of information retrieval use a specific data structure to index a corpus which is called "inverted file" or "reverse index". This inverted file collects information on all terms over the corpus documents specifying the identifiers of documents that contain the term in question, the frequency of each term in the documents of the corpus, the positions of the occurrences of the word... In this paper we use an oriented object database (db4o) instead of the inverted file, that is to say, instead to search a term in the inverted file, we will search it in the db4o database. The purpose of this work is to make a comparative study to see if the oriented object databases may be competing for the inverse index in terms of access speed and resource consumption using a large volume of data.

Keywords: Information Retrieval, indexation, oriented object database (db4o), inverted file.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1734
33 Interpreting Chopin’s Music Today: Mythologization of Art: Kitsch

Authors: Ilona Bala

Abstract:

The subject of this abstract is related to the notion of 'popular music', a notion that should be treated with extreme care, particularly when applied to Frederic Chopin, one of the greatest composers of Romanticism. By ‘popular music’, we mean a category of everyday music, set against the more intellectual kind, referred to as ‘classical’. We only need to look back to the culture of the nineteenth century to realize that this ‘popular music’ refers to the ‘music of the low’. It can be studied from a sociological viewpoint, or as sociological aesthetics. However, we cannot ignore the fact that, very quickly, this music spread to the wealthiest strata of the European society of the nineteenth century, while likewise the lowest classes often listen to the intellectual classical music, so pleasant to listen to. Further, we can observe that a sort of ‘sacralisation of kitsch’ occurs at the intersection between the classical and popular music. This process is the topic of this contribution. We will start by investigating the notion of kitsch through the study of Chopin’s popular compositions. However, before considering the popularisation of this music in today’s culture, we will have to focus on the use of the word kitsch in Chopin’s times, through his own musical aesthetics. Finally, the objective here will be to negate the theory that art is simply the intellectual definition of aesthetics. A kitsch can, obviously, only work on the emotivity of the masses, as it represents one of the features of culture-language (the words which the masses identify with). All art is transformed, becoming something outdated or even outmoded. Here, we are truly within a process of mythologization of art, through the study of the aesthetic reception of the musical work.

Keywords: F. Chopin, musical work, popular music, romantic music, mythologization of art, kitsch.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1272
32 Matrix Based Synthesis of EXOR dominated Combinational Logic for Low Power

Authors: Padmanabhan Balasubramanian, C. Hari Narayanan

Abstract:

This paper discusses a new, systematic approach to the synthesis of a NP-hard class of non-regenerative Boolean networks, described by FON[FOFF]={mi}[{Mi}], where for every mj[Mj]∈{mi}[{Mi}], there exists another mk[Mk]∈{mi}[{Mi}], such that their Hamming distance HD(mj, mk)=HD(Mj, Mk)=O(n), (where 'n' represents the number of distinct primary inputs). The method automatically ensures exact minimization for certain important selfdual functions with 2n-1 points in its one-set. The elements meant for grouping are determined from a newly proposed weighted incidence matrix. Then the binary value corresponding to the candidate pair is correlated with the proposed binary value matrix to enable direct synthesis. We recommend algebraic factorization operations as a post processing step to enable reduction in literal count. The algorithm can be implemented in any high level language and achieves best cost optimization for the problem dealt with, irrespective of the number of inputs. For other cases, the method is iterated to subsequently reduce it to a problem of O(n-1), O(n-2),.... and then solved. In addition, it leads to optimal results for problems exhibiting higher degree of adjacency, with a different interpretation of the heuristic, and the results are comparable with other methods. In terms of literal cost, at the technology independent stage, the circuits synthesized using our algorithm enabled net savings over AOI (AND-OR-Invert) logic, AND-EXOR logic (EXOR Sum-of- Products or ESOP forms) and AND-OR-EXOR logic by 45.57%, 41.78% and 41.78% respectively for the various problems. Circuit level simulations were performed for a wide variety of case studies at 3.3V and 2.5V supply to validate the performance of the proposed method and the quality of the resulting synthesized circuits at two different voltage corners. Power estimation was carried out for a 0.35micron TSMC CMOS process technology. In comparison with AOI logic, the proposed method enabled mean savings in power by 42.46%. With respect to AND-EXOR logic, the proposed method yielded power savings to the tune of 31.88%, while in comparison with AND-OR-EXOR level networks; average power savings of 33.23% was obtained.

Keywords: AOI logic, ESOP, AND-OR-EXOR, Incidencematrix, Hamming distance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1520
31 Islamic Education System: Implementation of Curriculum Kuttab Al-Fatih Semarang

Authors: Basyir Yaman, Fades Br. Gultom

Abstract:

The picture and pattern of Islamic education in the Prophet's period in Mecca and Medina is the history of the past that we need to bring back. The Basic Education Institute called Kuttab. Kuttab or Maktab comes from the word kataba which means to write. The popular Kuttab in the Prophet’s period aims to resolve the illiteracy in the Arab community. In Indonesia, this Institution has 25 branches; one of them is located in Semarang (i.e. Kuttab Al-Fatih). Kuttab Al-Fatih as a non-formal institution of Islamic education is reserved for children aged 5-12 years. The independently designed curriculum is a distinctive feature that distinguishes between Kuttab Al-Fatih curriculum and the formal institutional curriculum in Indonesia. The curriculum includes the faith and the Qur’an. Kuttab Al-Fatih has been licensed as a Community Activity Learning Center under the direct supervision and guidance of the National Education Department. Here, we focus to describe the implementation of curriculum Kuttab Al-Fatih Semarang (i.e. faith and al-Qur’an). After that, we determine the relevance between the implementation of the Kuttab Al-Fatih education system with the formal education system in Indonesia. This research uses literature review and field research qualitative methods. We obtained the data from the head of Kuttab Al-Fatih Semarang, vice curriculum, faith coordinator, al-Qur’an coordinator, as well as the guardians of learners and the learners. The result of this research is the relevance of education system in Kuttab Al-Fatih Semarang about education system in Indonesia. Kuttab Al-Fatih Semarang emphasizes character building through a curriculum designed in such a way and combines thematic learning models in modules.

Keywords: Islamic education system, implementation of curriculum, Kuttab Al-Fatih semarang, formal education system in Indonesia.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1302
30 Normal and Peaberry Coffee Beans Classification from Green Coffee Bean Images Using Convolutional Neural Networks and Support Vector Machine

Authors: Hira Lal Gope, Hidekazu Fukai

Abstract:

The aim of this study is to develop a system which can identify and sort peaberries automatically at low cost for coffee producers in developing countries. In this paper, the focus is on the classification of peaberries and normal coffee beans using image processing and machine learning techniques. The peaberry is not bad and not a normal bean. The peaberry is born in an only single seed, relatively round seed from a coffee cherry instead of the usual flat-sided pair of beans. It has another value and flavor. To make the taste of the coffee better, it is necessary to separate the peaberry and normal bean before green coffee beans roasting. Otherwise, the taste of total beans will be mixed, and it will be bad. In roaster procedure time, all the beans shape, size, and weight must be unique; otherwise, the larger bean will take more time for roasting inside. The peaberry has a different size and different shape even though they have the same weight as normal beans. The peaberry roasts slower than other normal beans. Therefore, neither technique provides a good option to select the peaberries. Defect beans, e.g., sour, broken, black, and fade bean, are easy to check and pick up manually by hand. On the other hand, the peaberry pick up is very difficult even for trained specialists because the shape and color of the peaberry are similar to normal beans. In this study, we use image processing and machine learning techniques to discriminate the normal and peaberry bean as a part of the sorting system. As the first step, we applied Deep Convolutional Neural Networks (CNN) and Support Vector Machine (SVM) as machine learning techniques to discriminate the peaberry and normal bean. As a result, better performance was obtained with CNN than with SVM for the discrimination of the peaberry. The trained artificial neural network with high performance CPU and GPU in this work will be simply installed into the inexpensive and low in calculation Raspberry Pi system. We assume that this system will be used in under developed countries. The study evaluates and compares the feasibility of the methods in terms of accuracy of classification and processing speed.

Keywords: Convolutional neural networks, coffee bean, peaberry, sorting, support vector machine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1553
29 The Use of Information and Communication Technologies in Electoral Procedures: Comments on Electronic Voting Security

Authors: Magdalena Musiał-Karg

Abstract:

The expansion of telecommunication and progress of electronic media constitute important elements of our times. The recent worldwide convergence of information and communication technologies (ICT) and dynamic development of the mass media is leading to noticeable changes in the functioning of contemporary states and societies. Currently, modern technologies play more and more important roles and filter down to almost every field of contemporary human life. It results in the growth of online interactions that can be observed by the inconceivable increase in the number of people with home PCs and Internet access. The proof of it is undoubtedly the emergence and use of concepts such as e-society, e-banking, e-services, e-government, e-government, e-participation and e-democracy. The newly coined word e-democracy evidences that modern technologies have also been widely used in politics. Without any doubt in most countries all actors of political market (politicians, political parties, servants in political/public sector, media) use modern forms of communication with the society. Most of these modern technologies progress the processes of getting and sending information to the citizens, communication with the electorate, and also – which seems to be the biggest advantage – electoral procedures. Thanks to implementation of ICT the interaction between politicians and electorate are improved. The main goal of this text is to analyze electronic voting (e-voting) as one of the important forms of electronic democracy in terms of security aspects. The author of this paper aimed at answering the questions of security of electronic voting as an additional form of participation in elections and referenda.

Keywords: Electronic democracy, electronic participation, electronic voting, security of e-voting, ICT.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1068
28 The Analysis of Deceptive and Truthful Speech: A Computational Linguistic Based Method

Authors: Seham El Kareh, Miramar Etman

Abstract:

Recently, detecting liars and extracting features which distinguish them from truth-tellers have been the focus of a wide range of disciplines. To the author’s best knowledge, most of the work has been done on facial expressions and body gestures but only few works have been done on the language used by both liars and truth-tellers. This paper sheds light on four axes. The first axis copes with building an audio corpus for deceptive and truthful speech for Egyptian Arabic speakers. The second axis focuses on examining the human perception of lies and proving our need for computational linguistic-based methods to extract features which characterize truthful and deceptive speech. The third axis is concerned with building a linguistic analysis program that could extract from the corpus the inter- and intra-linguistic cues for deceptive and truthful speech. The program built here is based on selected categories from the Linguistic Inquiry and Word Count program. Our results demonstrated that Egyptian Arabic speakers on one hand preferred to use first-person pronouns and present tense compared to the past tense when lying and their lies lacked of second-person pronouns, and on the other hand, when telling the truth, they preferred to use the verbs related to motion and the nouns related to time. The results also showed that there is a need for bigger data to prove the significance of words related to emotions and numbers.

Keywords: Egyptian Arabic corpus, computational analysis, deceptive features, forensic linguistics, human perception, truthful features.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1203
27 Incorporating Lexical-Semantic Knowledge into Convolutional Neural Network Framework for Pediatric Disease Diagnosis

Authors: Xiaocong Liu, Huazhen Wang, Ting He, Xiaozheng Li, Weihan Zhang, Jian Chen

Abstract:

The utilization of electronic medical record (EMR) data to establish the disease diagnosis model has become an important research content of biomedical informatics. Deep learning can automatically extract features from the massive data, which brings about breakthroughs in the study of EMR data. The challenge is that deep learning lacks semantic knowledge, which leads to impracticability in medical science. This research proposes a method of incorporating lexical-semantic knowledge from abundant entities into a convolutional neural network (CNN) framework for pediatric disease diagnosis. Firstly, medical terms are vectorized into Lexical Semantic Vectors (LSV), which are concatenated with the embedded word vectors of word2vec to enrich the feature representation. Secondly, the semantic distribution of medical terms serves as Semantic Decision Guide (SDG) for the optimization of deep learning models. The study evaluates the performance of LSV-SDG-CNN model on four kinds of Chinese EMR datasets. Additionally, CNN, LSV-CNN, and SDG-CNN are designed as baseline models for comparison. The experimental results show that LSV-SDG-CNN model outperforms baseline models on four kinds of Chinese EMR datasets. The best configuration of the model yielded an F1 score of 86.20%. The results clearly demonstrate that CNN has been effectively guided and optimized by lexical-semantic knowledge, and LSV-SDG-CNN model improves the disease classification accuracy with a clear margin.

Keywords: lexical semantics, feature representation, semantic decision, convolutional neural network, electronic medical record

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 594
26 Value of Sharing: Viral Advertisement

Authors: Duygu Aydın, Aşina Gülerarslan, Süleyman Karaçor, Tarık Doğan

Abstract:

Sharing motivations of viral advertisements by consumers and the impacts of these advertisements on the perceptions for brand will be questioned in this study. Three fundamental questions are answered in the study. These are advertisement watching and sharing motivations of individuals, criteria of liking viral advertisement and the impact of individual attitudes for viral advertisement on brand perception respectively. This study will be carried out via a viral advertisement which was practiced in Turkey. The data will be collected by survey method and the sample of the study consists of individuals who experienced the practice of sample advertisement. Data will be collected by online survey method and will be analyzed by using SPSS statistical package program. Recently traditional advertisement mind have been changing. New advertising approaches which have significant impacts on consumers have been argued. Viral advertising is a modernist advertisement mind which offers significant advantages to brands apart from traditional advertising channels such as television, radio and magazines. Viral advertising also known as Electronic Word-of- Mouth (eWOM) consists of free spread of convincing messages sent by brands among interpersonal communication. When compared to the traditional advertising, a more provocative thematic approach is argued. The foundation of this approach is to create advertisements that are worth sharing with others by consumers. When that fact is taken into consideration, in a manner of speaking it can also be stated that viral advertising is media engineering. The content worth sharing makes people being a volunteer spokesman of a brand and strengthens the emotional bonds among brand and consumer. Especially for some sectors in countries which are having traditional advertising channel limitations, viral advertising creates vital advantages.

Keywords: Viral advertising, marketing, consumers, brands.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2152
25 Very Large Scale Integration Architecture of Finite Impulse Response Filter Implementation Using Retiming Technique

Authors: S. Jalaja, A. M. Vijaya Prakash

Abstract:

Recursive combination of an algorithm based on Karatsuba multiplication is exploited to design a generalized transpose and parallel Finite Impulse Response (FIR) Filter. Mid-range Karatsuba multiplication and Carry Save adder based on Karatsuba multiplication reduce time complexity for higher order multiplication implemented up to n-bit. As a result, we design modified N-tap Transpose and Parallel Symmetric FIR Filter Structure using Karatsuba algorithm. The mathematical formulation of the FFA Filter is derived. The proposed architecture involves significantly less area delay product (APD) then the existing block implementation. By adopting retiming technique, hardware cost is reduced further. The filter architecture is designed by using 90 nm technology library and is implemented by using cadence EDA Tool. The synthesized result shows better performance for different word length and block size. The design achieves switching activity reduction and low power consumption by applying with and without retiming for different combination of the circuit. The proposed structure achieves more than a half of the power reduction by adopting with and without retiming techniques compared to the earlier design structure. As a proof of the concept for block size 16 and filter length 64 for CKA method, it achieves a 51% as well as 70% less power by applying retiming technique, and for CSA method it achieves a 57% as well as 77% less power by applying retiming technique compared to the previously proposed design.

Keywords: Carry save adder Karatsuba multiplication, mid-range Karatsuba multiplication, modified FFA, transposed filter, retiming.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 910