Search results for: accurate tagging algorithm

5614 Part of Speech Tagging Using Statistical Approach for Nepali Text

Abstract:

Part-of-speech (POS) tagging has always been a challenging task in Natural Language Processing. This article presents POS tagging for Nepali text using a Hidden Markov Model (HMM) and the Viterbi algorithm. Training and testing data sets are randomly separated from an annotated Nepali corpus. Both methods are employed on the data sets. The Viterbi algorithm is found to be computationally faster and more accurate than the HMM. An accuracy of 95.43% is achieved using the Viterbi algorithm. An error analysis of where the mismatches took place is discussed in detail.
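
As an illustration of the decoding step named in this abstract, the following is a minimal sketch of Viterbi decoding over an HMM. It is not the authors' implementation: the two-tag tagset, the English toy words, and all probabilities are hypothetical values chosen only to make the example runnable.

```python
import math

def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-12):
    """Return the most probable tag sequence for `words` under an HMM.

    Probabilities are combined in log space; `floor` is an assumed
    smoothing constant for unseen words and transitions.
    """
    if not words:
        return []

    def lp(p):
        return math.log(max(p, floor))

    # best[i][t]: best log-probability of any tag path ending in t at word i
    best = [{t: lp(start_p.get(t, 0)) + lp(emit_p[t].get(words[0], 0)) for t in tags}]
    back = [{}]
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            prev, score = max(
                ((p, best[i - 1][p] + lp(trans_p[p].get(t, 0))) for p in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = score + lp(emit_p[t].get(words[i], 0))
            back[i][t] = prev

    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]

# Hypothetical toy model (not taken from the paper).
TAGS = ["NOUN", "VERB"]
START = {"NOUN": 0.8, "VERB": 0.2}
TRANS = {"NOUN": {"NOUN": 0.3, "VERB": 0.7}, "VERB": {"NOUN": 0.6, "VERB": 0.4}}
EMIT = {"NOUN": {"dogs": 0.6, "bark": 0.1}, "VERB": {"dogs": 0.05, "bark": 0.7}}

print(viterbi(["dogs", "bark"], TAGS, START, TRANS, EMIT))
```

In a real tagger the start, transition, and emission tables would be estimated from the annotated training split rather than written by hand.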

Keywords: hidden markov model, natural language processing, POS tagging, viterbi algorithm

Procedia PDF Downloads 298

5613 A Tagging Algorithm in Augmented Reality for Mobile Device Screens

Authors: Doga Erisik, Ahmet Karaman, Gulfem Alptekin, Ozlem Durmaz Incel

Abstract:

Augmented reality (AR) is a type of virtual reality aiming to duplicate the real world’s environment on a computer’s video feed. The mobile application built for this project (called SARAS) enables annotating real-world points of interest (POIs) that are located near the mobile user. In this paper, we aim at introducing a robust and simple algorithm for placing labels in an augmented reality system. The system places the labels of POIs whose GPS coordinates are given on the mobile device screen. The proposed algorithm is compared to an existing one in terms of energy consumption and accuracy. The results show that the proposed algorithm gives better results in energy consumption and accuracy while standing still, and acceptably accurate results when driving. The technique provides benefits to AR browsers with its open access algorithm. Going forward, the algorithm will be improved to react more rapidly to position changes while driving.

Keywords: accurate tagging algorithm, augmented reality, localization, location-based AR

Procedia PDF Downloads 335

5612 Accurate Algorithm for Selecting Ground Motions Satisfying Code Criteria

Authors: S. J. Ha, S. J. Baik, T. O. Kim, S. W. Han

Abstract:

For computing the seismic responses of structures, current seismic design provisions permit response history analyses (RHA) that can be used without limitations in height, seismic design category, and building irregularity. In order to obtain accurate seismic responses using RHA, it is important to use adequate input ground motions. Current seismic design provisions provide criteria for selecting ground motions. In this study, an accurate and computationally efficient algorithm is proposed for selecting ground motions that satisfy the requirements specified in current seismic design provisions. The accuracy of the proposed algorithm is verified using single-degree-of-freedom systems with various natural periods and yield strengths. This study shows that the mean seismic responses obtained from RHA with seven and ten ground motions selected using the proposed algorithm produce errors within 20% and 13%, respectively.

Keywords: algorithm, ground motion, response history analysis, selection

Procedia PDF Downloads 258

5611 Resource Creation Using Natural Language Processing Techniques for Malay Translated Qur'an

Authors: Nor Diana Ahmad, Eric Atwell, Brandon Bennett

Abstract:

Text processing techniques for English have been developed for several decades, but text processing methods for the Malay language are still far behind. Moreover, there are limited resources and tools available for computational linguistic analysis of the Malay language. Therefore, this research presents the use of natural language processing (NLP) in processing Malay translated Qur’an text. As a result, a new language resource for the Malay translated Qur’an was created. This resource will help other researchers to build the necessary processing tools for the Malay language. This research also develops a simple question-answer prototype to demonstrate the use of the Malay Qur’an resource for text processing. This prototype has been developed using Python. The prototype pre-processes the Malay Qur’an and an input query using a stemming algorithm and then searches for occurrences of the query word stem. The results show improved matching likelihood between a user query and its answer. A POS-tagging algorithm has also been produced. The stemming and tagging algorithms can be used as tools for research related to other Malay texts and can be used to support applications such as information retrieval, question answering systems, ontology-based search and other text analysis tasks.

Keywords: language resource, Malay translated Qur'an, natural language processing (NLP), text processing

Procedia PDF Downloads 280

5610 Touch Interaction through Tagging Context

Authors: Gabriel Chavira, Jorge Orozco, Salvador Nava, Eduardo Álvarez, Julio Rolón, Roberto Pichardo

Abstract:

Ambient Intelligence promotes a shift in computing which involves fitting out environments with devices to support context-aware applications. One of the main objectives is to reduce the user’s interactive effort to a minimum; the diversity and quantity of devices that surround people in existing environments increase the difficulty of achieving this goal. Mobile phones, with their amazing global penetration, are an excellent device for delivering new services to the user without requiring a learning effort. The environment will have to be able to perceive all of the interaction techniques. In this paper, we present the PICTAC model (Perceiving touch Interaction through TAgging Context), which similarly delivers service to members of a research group.

Keywords: ambient intelligence, tagging context, touch interaction, touching services

Procedia PDF Downloads 347

5609 Searching Linguistic Synonyms through Parts of Speech Tagging

Authors: Faiza Hussain, Usman Qamar

Abstract:

Synonym-based searching is recognized to be a complicated problem, as text mining from unstructured web data is challenging. Finding useful information that matches the user’s need in a bulk of web pages is a cumbersome task. In this paper, a novel and practical synonym retrieval technique is proposed for addressing this problem. For replacement of semantics, user intent is taken into consideration to realize the technique. Parts-of-speech tagging is applied for pattern generation of the query, and a thesaurus for this experiment was formed and used. Compared with non-context-based searching, context-based searching proved to be a more efficient approach when dealing with linguistic semantics. This approach is very beneficial for intent-based searching. Finally, results and future dimensions are presented.

Keywords: natural language processing, text mining, information retrieval, parts-of-speech tagging, grammar, semantics

Procedia PDF Downloads 274

5608 An Accurate Method for Phylogeny Tree Reconstruction Based on a Modified Wild Dog Algorithm

Authors: Essam Al Daoud

Abstract:

This study solves a phylogeny problem by using modified wild dog pack optimization. The least squares error is considered as a cost function that needs to be minimized. Therefore, in each iteration, new distance matrices based on the constructed trees are calculated and used to select the alpha dog. To test the suggested algorithm, ten homologous genes are selected and collected from National Center for Biotechnology Information (NCBI) databanks (i.e., 16S, 18S, 28S, Cox 1, ITS1, ITS2, ETS, ATPB, Hsp90, and STN). The data are divided into three categories: 50 taxa, 100 taxa and 500 taxa. The empirical results show that the proposed algorithm is more reliable and accurate than other implemented methods.

Keywords: least square, neighbor joining, phylogenetic tree, wild dog pack

Procedia PDF Downloads 286

5607 3D Reconstruction of Human Body Based on Gender Classification

Authors: Jiahe Liu, Hongyang Yu, Feng Qian, Miao Luo

Abstract:

SMPL-X was a powerful parametric human body model that included male, neutral, and female models, with significant gender differences between these three models. During the process of 3D human body reconstruction, the correct selection of standard templates was crucial for obtaining accurate results. To address this issue, we developed an efficient gender classification algorithm to automatically select the appropriate template for 3D human body reconstruction. The key to this gender classification algorithm was the precise analysis of human body features. By using the SMPL-X model, the algorithm could detect and identify gender features of the human body, thereby determining which standard template should be used. The accuracy of this algorithm made the 3D reconstruction process more accurate and reliable, as it could adjust model parameters based on individual gender differences. SMPL-X and the related gender classification algorithm have brought important advancements to the field of 3D human body reconstruction. By accurately selecting standard templates, they have improved the accuracy of reconstruction and have broad potential in various application fields. These technologies continue to drive the development of the 3D reconstruction field, providing us with more realistic and accurate human body models.

Keywords: gender classification, joint detection, SMPL-X, 3D reconstruction

Procedia PDF Downloads 28

5606 Gene Prediction in DNA Sequences Using an Ensemble Algorithm Based on Goertzel Algorithm and Anti-Notch Filter

Authors: Hamidreza Saberkari, Mousa Shamsi, Hossein Ahmadi, Saeed Vaali, MohammadHossein Sedaaghi

Abstract:

In recent years, using signal processing tools for accurate identification of protein coding regions has become a challenge in bioinformatics. Most genomic signal processing methods are based on the period-3 characteristics of the nucleotides in DNA strands; consequently, spectral analysis is applied to the numerical sequences of DNA to find the location of periodic components. In this paper, a novel ensemble algorithm for gene selection in DNA sequences is presented, which is based on the combination of the Goertzel algorithm and an anti-notch filter (ANF). The proposed algorithm has several advantages when compared to conventional methods. Firstly, it identifies protein coding regions more accurately because the Goertzel algorithm is tuned at the desired frequency. Secondly, faster detection time is achieved. The proposed algorithm is applied to several genes, including genes available in the databases BG570 and HMR195, and the results are compared to other methods based on nucleotide-level evaluation criteria. Implementation results show the excellent performance of the proposed algorithm in identifying protein coding regions, specifically in the identification of small-scale gene areas.
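
To make the period-3 idea concrete, here is a minimal sketch of the Goertzel recurrence evaluated at the single DFT bin k = N/3. It covers only the Goertzel half of the ensemble (the anti-notch filter is not shown), and the binary indicator mapping of the four bases is an assumption, since the abstract does not say which numerical mapping the authors use.

```python
import math

def goertzel_power(samples, k):
    """Squared magnitude of DFT bin k of `samples`, via the Goertzel recurrence."""
    n = len(samples)
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2

def period3_power(seq):
    """Total spectral power at frequency 1/3 over the four base indicator sequences."""
    n = len(seq) - len(seq) % 3        # trim so that N/3 is an integer bin
    if n == 0:
        return 0.0
    seq = seq[:n].upper()
    total = 0.0
    for base in "ACGT":
        indicator = [1.0 if ch == base else 0.0 for ch in seq]  # assumed mapping
        total += goertzel_power(indicator, n // 3)
    return total

# A strictly 3-periodic toy sequence has a strong period-3 component,
# while a constant sequence has essentially none.
print(period3_power("ATG" * 10), period3_power("A" * 30))
```

Because only one frequency bin is needed, the recurrence costs one multiplication per sample, which is the usual reason to prefer it over a full FFT in a sliding-window scan.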

Keywords: protein coding regions, period-3, anti-notch filter, Goertzel algorithm

Procedia PDF Downloads 357

5605 Unsupervised Part-of-Speech Tagging for Amharic Using K-Means Clustering

Authors: Zelalem Fantahun

Abstract:

Part-of-speech tagging is the process of assigning a part-of-speech or other lexical class marker to each word in naturally occurring text. It is one of the most fundamental tasks in almost all natural language processing. In natural language processing, providing large amounts of manually annotated data is a knowledge acquisition bottleneck. Since Amharic is an under-resourced language, the availability of a tagged corpus is a bottleneck for natural language processing, especially for POS tagging. A promising direction for tackling this problem is to provide a system that does not require manually tagged data. In unsupervised learning, the learner is not provided with classifications. Unsupervised algorithms seek out similarity between pieces of data in order to determine whether they can be characterized as forming a group. This paper describes the development of an unsupervised part-of-speech tagger for Amharic using K-Means clustering, motivated by the large amount of data produced in day-to-day activities. The tagger is developed in four phases. First, the unlabeled data (raw text) is divided into 10 folds and tokenized; at this level, the raw text is split into sentences and then into words. The second phase is feature extraction, which covers word frequency and the syntactic and morphological features of a word. The third phase is clustering: among different clustering algorithms, K-means is selected and implemented in this study to bring groups of similar words together. The fourth phase is mapping, in which each cluster is examined carefully and the most common tag is assigned to the group. This study identifies two features that are capable of distinguishing one part of speech from others, namely morphological features and positional information, and shows that it is possible to use unsupervised learning for Amharic POS tagging. To increase the performance of the unsupervised part-of-speech tagger, other features not included in this study, such as semantic information, need to be incorporated. Finally, based on the experimental results, the system achieves a maximum accuracy of 81%.
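
The clustering and mapping phases described above can be sketched as follows. This is a plain K-means written from scratch, not the authors' system: the two-dimensional feature vectors, the deterministic initialisation, and the hand-checked `seed_tags` used for the mapping step are all hypothetical.

```python
from collections import Counter

def kmeans(points, k, iters=20):
    """Plain K-means; returns one cluster index per point."""
    centroids = [list(p) for p in points[:k]]   # assumed init: first k points
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        assign = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

def map_clusters_to_tags(assign, seed_tags):
    """Give each cluster the most common tag among a few inspected words."""
    votes = {}
    for idx, tag in seed_tags.items():
        votes.setdefault(assign[idx], Counter())[tag] += 1
    return {c: counts.most_common(1)[0][0] for c, counts in votes.items()}

# Hypothetical feature vectors, e.g. (morphological score, positional score).
features = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9), (0.1, 0.2), (4.9, 5.0)]
clusters = kmeans(features, k=2)
mapping = map_clusters_to_tags(clusters, seed_tags={0: "N", 1: "V"})
print([mapping[c] for c in clusters])
```

The mapping step is the only place where human judgement enters, which matches the abstract's description of inspecting each cluster and assigning its most common tag.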

Keywords: POS tagging, Amharic, unsupervised learning, k-means

Procedia PDF Downloads 405

5604 Control Algorithm for Home Automation Systems

Authors: Marek Długosz, Paweł Skruch

Abstract:

One of the purposes of home automation systems is to provide appropriate comfort to the users through suitable air temperature control and stabilization inside the rooms. Controlling the temperature level is not a simple task, and the basic difficulty results from the fact that accurate parameters of the object of control, that is, a building, remain unknown. Although the structure of the model is known, the identification of model parameters is a difficult task. In this paper, a control algorithm is presented that allows the preset temperature to be reached inside the building within the specified time without the need to know accurate parameters of the building itself.

Keywords: control, home automation system, wireless networking, automation engineering

Procedia PDF Downloads 575

5603 3D Human Body Reconstruction Based on Multiple Viewpoints

Authors: Jiahe Liu, Hongyang Yu, Feng Qian, Miao Luo

Abstract:

The aim of this study was to improve the effects of human body 3D reconstruction. The MvP algorithm was adopted to obtain key point information from multiple perspectives. This algorithm allowed the capture of human posture and joint positions from multiple angles, providing more comprehensive and accurate data. The study also incorporated the SMPL-X model, which has been widely used for human body modeling, to achieve more accurate 3D reconstruction results. The use of the MvP algorithm made it possible to observe the reconstructed object from multiple angles, thus reducing the problems of blind spots and missing information. This algorithm was able to effectively capture key point information, including the position and rotation angle of limbs, providing key data for subsequent 3D reconstruction. Compared with traditional single-view methods, the method of multi-view fusion significantly improved the accuracy and stability of reconstruction. By combining the MvP algorithm with the SMPL-X model, we successfully achieved better human body 3D reconstruction effects. The SMPL-X model is highly scalable and can generate highly realistic 3D human body models, thus providing more detail and shape information.

Keywords: 3D human reconstruction, multi-view, joint point, SMPL-X

Procedia PDF Downloads 28

5602 A Fast Silhouette Detection Algorithm for Shadow Volumes in Augmented Reality

Authors: Hoshang Kolivand, Mahyar Kolivand, Mohd Shahrizal Sunar, Mohd Azhar M. Arsad

Abstract:

Real-time shadow generation in virtual environments and Augmented Reality (AR) has been a hot topic for the last three decades. Shadow generation in AR involves a large amount of calculation, so a fast algorithm is needed that can be implemented in any real-time rendering. In this paper, a silhouette detection algorithm is presented to generate shadows for AR systems. The Δ+ algorithm is based on extending the edges of occluders to recognize which edges are silhouettes in the case of real-time rendering. An accurate comparison between the proposed algorithm and current silhouette detection algorithms is carried out to show the reduction in calculation achieved by the presented algorithm. The algorithm is tested in both virtual environments and AR systems. We think that this algorithm has the potential to be a fundamental algorithm for shadow generation in all complex environments.

Keywords: silhouette detection, shadow volumes, real-time shadows, rendering, augmented reality

Procedia PDF Downloads 407

5601 Towards a Framework for Embedded Weight Comparison Algorithm with Business Intelligence in the Plantation Domain

Authors: M. Pushparani, A. Sagaya

Abstract:

Embedded systems have emerged as important elements in various domains with extensive applications in automotive, commercial, consumer, healthcare and transportation markets, as there is emphasis on intelligent devices. On the other hand, Business Intelligence (BI) has also been extensively used in a range of applications, especially in the agriculture domain, which is the area of this research. The aim of this research is to create a framework for an Embedded Weight Comparison Algorithm with Business Intelligence (EWCA-BI). The weight comparison algorithm will be embedded within the plantation management system and the weighbridge system. This algorithm will be used to estimate the weight at the site, which will be compared with the actual weight at the plantation. The algorithm will be used to build the necessary alerts when there is a discrepancy in the weight, thus enabling better decision making. In current practice, data are collected from various locations in various forms. It is a challenge to consolidate data to obtain timely and accurate information for effective decision making. In addition, unstable network connections make it difficult to get timely, accurate information. To overcome these challenges, the weight comparison algorithm is embedded on a portable device that also assists in data capture and synchronizes data at various locations, overcoming the network shortcomings at collection points. The EWCA-BI will provide real-time information at any given point of time, thus enabling non-latent BI reports that will provide crucial information to enable efficient operational decision making. This research has a high potential for bringing embedded systems into the agriculture industry. EWCA-BI will provide BI reports with accurate information from uncompromised data using an embedded system and provide alerts, thereby enabling effective operation management decision-making at the site.

Keywords: embedded business intelligence, weight comparison algorithm, oil palm plantation, embedded systems

Procedia PDF Downloads 248

5600 Developing an Accurate AI Algorithm for Histopathologic Cancer Detection

Authors: Leah Ning

Abstract:

This paper discusses the development of a machine learning algorithm that accurately detects metastatic breast cancer (cancer that has spread from its site of origin) in selected images that come from pathology scans of lymph node sections. Being able to develop an accurate artificial intelligence (AI) algorithm would help significantly in breast cancer diagnosis, since manual examination of lymph node scans is both tedious and oftentimes highly subjective. The usage of AI in the diagnosis process provides a much more straightforward, reliable, and efficient method for medical professionals and would enable faster diagnosis and, therefore, more immediate treatment. The overall approach used was to train a convolutional neural network (CNN) on a set of pathology scan data and use the trained model to classify whether a new scan is benign or malignant, outputting a 0 or a 1, respectively. The final model’s prediction accuracy is very high, with 100% for the train set and over 70% for the test set. Being able to have such high accuracy using an AI model is monumental in regard to medical pathology and cancer detection. Having AI as a new tool capable of quick detection will significantly help medical professionals and patients suffering from cancer.

Keywords: breast cancer detection, AI, machine learning, algorithm

Procedia PDF Downloads 56

5599 Automatic Tagging and Accuracy in Assamese Text Data

Authors: Chayanika Hazarika Bordoloi

Abstract:

This paper is an attempt to work on a highly inflectional language called Assamese. It is one of the national languages of India, and very little has been achieved for it in terms of computational research. Building a language processing tool for a natural language is not very smooth, as the standard and the language representation change at various levels. This paper presents the inflectional suffixes of Assamese verbs and how statistical tools, along with linguistic features, can improve tagging accuracy. Conditional random fields (the CRF tool) were used to automatically tag and train the text data; accuracy improved after linguistic features were fed into the training data. Because Assamese is a highly inflectional language, it is challenging to standardize its morphology. Inflectional suffixes are used as a feature of the text data. In order to analyze the inflections of Assamese word forms, a list of suffixes is prepared. This list comprises all possible suffixes that the various categories can take. Assamese words can be classified into inflected classes (noun, pronoun, adjective and verb) and un-inflected classes (adverb and particle). The corpus used for this morphological analysis has a huge number of tokens. The corpus is a mixed corpus, and it has given satisfactory accuracy. The accuracy rate of the tagger has gradually improved with the modified training data.

Keywords: CRF, morphology, tagging, tagset

Procedia PDF Downloads 164

5598 Software Architecture Optimization Using Swarm Intelligence Techniques

Authors: Arslan Ellahi, Syed Amjad Hussain, Fawaz Saleem Bokhari

Abstract:

Optimization of software architecture can be done with respect to quality attributes (QAs). In this paper, multiple research papers are analyzed along the different dimensions that have been used to classify those attributes. We propose a swarm intelligence technique, the metaheuristic ant colony optimization algorithm, as a contribution to solving this critical optimization problem of software architecture. We have ranked quality attributes and run our algorithm on every QA, and then we rank those on the basis of accuracy. At the end, we select the most accurate quality attributes. The ant colony algorithm is an effective algorithm and will perform best in optimizing the QAs and ranking them.

Keywords: complexity, rapid evolution, swarm intelligence, dimensions

Procedia PDF Downloads 223

5597 DOA Estimation Using Golden Section Search

Authors: Niharika Verma, Sandeep Santosh

Abstract:

Direction of arrival (DOA) estimation is a localization technique used in the communication field. Various algorithms have been developed for direction of arrival estimation, such as MUSIC and ROOT MUSIC. These algorithms depend on parameters such as the number of antenna array elements and the number of snapshots. Basically, the MUSIC spectrum is evaluated and the peaks obtained are taken as the angles of arrival. The angles evaluated using this process depend on the scanning interval chosen, and the accuracy of the results depends on the coarseness of that interval. In this paper, golden section search is applied to the MUSIC algorithm, and more accurate results are thereby achieved. Initially, coarse DOA estimation is done using the MUSIC algorithm in the range -90 to 90 degrees at intervals of 10 degrees. After the peaks are obtained, fine DOA estimation is done using golden section search. In addition, the partitioning method is applied to estimate the number of signals incident on the antenna array. The dependency of the algorithm on the number of snapshots is also explained. Hence, accurate results are obtained using this algorithm.
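
The coarse-then-fine procedure in this abstract can be sketched as below. The `toy_spectrum` function is a hypothetical single-peak stand-in for a MUSIC pseudo-spectrum (no array processing is done here), and the refinement bracket of one grid step on either side of the coarse peak is an assumption.

```python
import math

INV_PHI = (math.sqrt(5) - 1) / 2   # 1/phi, about 0.618

def golden_section_max(f, a, b, tol=1e-6):
    """Locate the maximiser of a unimodal function f on [a, b]."""
    c = b - INV_PHI * (b - a)
    d = a + INV_PHI * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:                 # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - INV_PHI * (b - a)
            fc = f(c)
        else:                       # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + INV_PHI * (b - a)
            fd = f(d)
    return (a + b) / 2

def toy_spectrum(theta_deg):
    """Hypothetical pseudo-spectrum with a single peak at 23.7 degrees."""
    return 1.0 / (1e-3 + (theta_deg - 23.7) ** 2)

# Coarse scan from -90 to 90 degrees in 10-degree steps.
grid = range(-90, 91, 10)
coarse_peak = max(grid, key=toy_spectrum)

# Fine search within one grid step on either side of the coarse peak.
fine_peak = golden_section_max(toy_spectrum, coarse_peak - 10, coarse_peak + 10)
print(coarse_peak, round(fine_peak, 3))
```

Each golden section iteration reuses one of the two previous evaluations, so the bracket shrinks by a factor of about 0.618 per new spectrum evaluation, which is far cheaper than scanning a fine grid.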

Keywords: Direction of Arrival (DOA), golden section search, MUSIC, number of snapshots

Procedia PDF Downloads 415

5596 Building Scalable and Accurate Hybrid Kernel Mapping Recommender

Authors: Hina Iqbal, Mustansar Ali Ghazanfar, Sandor Szedmak

Abstract:

Recommender systems use artificial intelligence practices for filtering obscure information and can predict whether a user will like a specified item. Kernel Mapping Recommender (KMR) systems have been proposed, which are accurate, state-of-the-art algorithms that address recommender system design objectives such as long tail, cold start, and sparsity. The aim of this research is to propose a hybrid framework that can efficiently integrate different versions of the KMR algorithm, namely item-based and user-based KMR. We have proposed various heuristic algorithms that integrate different versions of KMR into a unified framework, resulting in improved accuracy and the elimination of problems associated with conventional recommender systems. We have tested our system on a publicly available movie dataset and benchmarked it against KMR. The results (in terms of accuracy, precision, recall, F1 measure and ROC metrics) reveal that the proposed algorithm is quite accurate, especially under cold-start and sparse scenarios.

Keywords: Kernel Mapping Recommender Systems, hybrid recommender systems, cold start, sparsity, long tail

Procedia PDF Downloads 300

5595 Integrated Target Tracking and Control for Automated Car-Following of Truck Platforms

Authors: Fadwa Alaskar, Fang-Chieh Chou, Carlos Flores, Xiao-Yun Lu, Alexandre M. Bayen

Abstract:

This article proposes a perception model for enhancing the accuracy and stability of car-following control of a longitudinally automated truck. We applied a fusion-based tracking algorithm to measurements of a single preceding vehicle needed for car-following control. This algorithm fuses two types of data, radar and LiDAR data, to obtain more accurate and robust longitudinal perception of the subject vehicle in various weather conditions. The filter’s resulting signals are fed to the gap control algorithm at every tracking loop, which is composed of a high-level gap control and a lower-level acceleration tracking system. Several highway tests have been performed with two trucks. The tests show accurate and fast tracking of the target, which has a positive impact on the gap control loop. The experiments also show the fulfilment of control design requirements, such as fast tracking of speed variations and robust time gap following.

Keywords: object tracking, perception, sensor fusion, adaptive cruise control, cooperative adaptive cruise control

Procedia PDF Downloads 194

5594 Morpheme Based Parts of Speech Tagger for Kannada Language

Authors: M. C. Padma, R. J. Prathibha

Abstract:

Parts of speech tagging is the process of assigning appropriate parts of speech tags to the words in a given text. The critical information needed for tagging a word comes from its internal structure rather than from its neighboring words. The internal structure of a word comprises its morphological features and grammatical information. This paper presents a morpheme-based parts of speech tagger for the Kannada language. The proposed work uses a hierarchical tag set for assigning tags. The system is tested on some Kannada words taken from the EMILLE corpus. Experimental results show that the performance of the proposed system is above 90%.

Keywords: hierarchical tag set, morphological analyzer, natural language processing, paradigms, parts of speech

Procedia PDF Downloads 258

5593 Automatic Facial Skin Segmentation Using Possibilistic C-Means Algorithm for Evaluation of Facial Surgeries

Authors: Elham Alaee, Mousa Shamsi, Hossein Ahmadi, Soroosh Nazem, Mohammad Hossein Sedaaghi

Abstract:

The human face has a fundamental role in the appearance of individuals, so the importance of facial surgeries is undeniable. Thus, there is a need for appropriate and accurate facial skin segmentation in order to extract different features. Since the Fuzzy C-Means (FCM) clustering algorithm does not work appropriately for noisy images and outliers, in this paper we exploit the Possibilistic C-Means (PCM) algorithm to segment the facial skin. For this purpose, we first convert facial images from the RGB to the YCbCr color space. To evaluate the performance of the proposed algorithm, the database of Sahand University of Technology, Tabriz, Iran was used. For comparison with the proposed algorithm, the FCM and Expectation-Maximization (EM) algorithms are also used for facial skin segmentation. The proposed method shows better results than the other segmentation methods. Results include a misclassification error of 0.032 and a region area error of 0.045 for the proposed algorithm.
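
The colour-space conversion used as the first step can be sketched as follows. The abstract does not say which YCbCr variant the authors use; this sketch assumes the full-range BT.601 coefficients used by JPEG/JFIF, and it shows only the per-pixel conversion, not the PCM clustering.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr (assumed BT.601/JFIF)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b

    def clamp(v):
        # Round to the nearest integer and keep the result in 0..255.
        return max(0, min(255, int(round(v))))

    return clamp(y), clamp(cb), clamp(cr)

# Grey-scale pixels carry no chroma, so Cb and Cr sit at the midpoint 128.
print(rgb_to_ycbcr(255, 255, 255), rgb_to_ycbcr(0, 0, 0))
```

Separating luminance (Y) from chrominance (Cb, Cr) is a common reason to cluster skin pixels in this space, since skin tones tend to vary less in chroma than in brightness.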

Keywords: facial image, segmentation, PCM, FCM, skin error, facial surgery

Procedia PDF Downloads 545

5592 Particle Filter State Estimation Algorithm Based on Improved Artificial Bee Colony Algorithm

Authors: Guangyuan Zhao, Nan Huang, Xuesong Han, Xu Huang

Abstract:

In order to solve the problem of sample impoverishment in the traditional particle filter algorithm and achieve accurate state estimation in a nonlinear system, a particle filter method based on an improved artificial bee colony (ABC) algorithm was proposed. The algorithm simulated the process of bee foraging and optimization and moved the particles toward the high-likelihood region of the posterior probability to improve the rationality of the particle distribution. The opposition-based learning (OBL) strategy is introduced to optimize the initial population of the artificial bee colony algorithm. The convergence factor is introduced into the neighborhood search strategy to limit the search range and improve the convergence speed. Finally, the crossover and mutation operations of the genetic algorithm are introduced into the search mechanism of the following bee, which makes the algorithm jump out of local extreme values quickly and continue to search for the global extreme value to improve its optimization ability. The simulation results show that the improved method can improve the estimation accuracy of particle filters, ensure the diversity of particles, and improve the rationality of the particle distribution.
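
The opposition-based learning step mentioned above can be illustrated with a short sketch. It shows only the generic OBL initialisation (the opposite of x in [lo, hi] is lo + hi - x, and the better half of the combined set is kept); the `sphere` cost function, the bounds, and the fixed seed are hypothetical, and the rest of the improved ABC algorithm is not shown.

```python
import random

def opposite(x, lo, hi):
    """Opposite point of x inside the box [lo, hi], taken per dimension."""
    return [l + h - xi for xi, l, h in zip(x, lo, hi)]

def obl_init(n, lo, hi, cost, seed=0):
    """Draw n random candidates, add their opposites, keep the n cheapest."""
    rng = random.Random(seed)
    population = [[rng.uniform(l, h) for l, h in zip(lo, hi)] for _ in range(n)]
    candidates = population + [opposite(x, lo, hi) for x in population]
    return sorted(candidates, key=cost)[:n]

def sphere(x):
    """Hypothetical cost function to minimise (global minimum at the origin)."""
    return sum(v * v for v in x)

initial = obl_init(5, lo=[-5.0, -5.0], hi=[5.0, 5.0], cost=sphere)
print(opposite([1.0, 2.0], [0.0, 0.0], [10.0, 10.0]), len(initial))
```

Evaluating each candidate together with its opposite doubles the cost of initialisation but tends to start the search from a population that covers the box more evenly.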

Keywords: particle filter, impoverishment, state estimation, artificial bee colony algorithm

Procedia PDF Downloads 92

5591 Forward Stable Computation of Roots of Real Polynomials with Only Real Distinct Roots

Authors: Nevena Jakovčević Stor, Ivan Slapničar

Abstract:

Any polynomial can be expressed as a characteristic polynomial of a complex symmetric arrowhead matrix. This expression is not unique. If the polynomial is real with only real distinct roots, the matrix can be chosen as real. By using an accurate forward stable algorithm for computing eigenvalues of real symmetric arrowhead matrices, we derive a forward stable algorithm for computing the roots of such polynomials in O(n^2) operations. The algorithm computes each root to almost full accuracy. In some cases, the algorithm invokes extended precision routines, but only in the non-iterative part. Our examples include numerically difficult problems, like the well-known Wilkinson’s polynomials. Our algorithm compares favorably to other methods for polynomial root-finding, like MPSolve or Newton’s method.
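
For readers unfamiliar with arrowhead matrices, the standard identity behind this connection can be written out as follows. The symbols D, z, and α are notation assumed here for illustration and are not taken from the paper; the identity itself follows from the Schur complement and holds for λ different from every d_i.

```latex
A = \begin{bmatrix} D & z \\ z^{T} & \alpha \end{bmatrix},
\qquad D = \operatorname{diag}(d_1, \dots, d_{n-1}),
\qquad z = (\zeta_1, \dots, \zeta_{n-1})^{T},

\det(A - \lambda I)
  = \Bigl( \alpha - \lambda - \sum_{i=1}^{n-1} \frac{\zeta_i^{2}}{d_i - \lambda} \Bigr)
    \prod_{i=1}^{n-1} (d_i - \lambda).
```

The roots of the polynomial on the left are therefore the eigenvalues of A, so a polynomial whose coefficients match this expansion can be solved with an arrowhead eigenvalue routine.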

Keywords: roots of polynomials, eigenvalue decomposition, arrowhead matrix, high relative accuracy

Procedia PDF Downloads 377

5590 The Automatisation of Dictionary-Based Annotation in a Parallel Corpus of Old English

Authors: Ana Elvira Ojanguren Lopez, Javier Martin Arista

Abstract:

The aims of this paper are to present the automatisation procedure adopted in the implementation of a parallel corpus of Old English, as well as to assess the progress of automatisation with respect to tagging, annotation, and lemmatisation. The corpus consists of an aligned parallel text with word-for-word comparison Old English-English that provides the Old English segment with inflectional form tagging (gloss, lemma, category, and inflection) and lemma annotation (spelling, meaning, inflectional class, paradigm, word-formation and secondary sources). This parallel corpus is intended to fill a gap in the field of Old English, in which no parallel and/or lemmatised corpora are available, while the average amount of corpus annotation is low. With this background, this presentation has two main parts. The first part, which focuses on tagging and annotation, selects the layouts and fields of lexical databases that are relevant for these tasks. Most information used for the annotation of the corpus can be retrieved from the lexical and morphological database Nerthus and the database of secondary sources Freya. These are the sources of linguistic and metalinguistic information that will be used for the annotation of the lemmas of the corpus, including morphological and semantic aspects as well as the references to the secondary sources that deal with the lemmas in question. Although substantially adapted and re-interpreted, the lemmatised part of these databases draws on the standard dictionaries of Old English, including The Student's Dictionary of Anglo-Saxon, An Anglo-Saxon Dictionary, and A Concise Anglo-Saxon Dictionary. The second part of this paper deals with lemmatisation. It presents the lemmatiser Norna, which has been implemented on Filemaker software. It is based on a concordance and an index to the Dictionary of Old English Corpus, which comprises around three thousand texts and three million words. In its present state, the lemmatiser Norna can assign a lemma to around 80% of textual forms on an automatic basis, by searching the index and the concordance for prefixes, stems and inflectional endings. The conclusions of this presentation stress the limits of the automatisation of dictionary-based annotation in a parallel corpus. While the tagging and annotation are largely automatic even at the present stage, the automatisation of alignment is left for future research. Lemmatisation and morphological tagging are expected to be fully automatic in the near future, once the database of secondary sources Freya and the lemmatiser Norna have been completed.
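
Illustrative sketch (a toy, unrelated to the actual Norna implementation on Filemaker): assigning a lemma by stripping an inflectional ending and looking the remaining stem up in an index can be pictured as below. The three-entry index, the list of endings, and the sample forms are hypothetical and far smaller than any real resource.

```python
# Hypothetical miniature lemma index and list of inflectional endings.
LEMMA_INDEX = {"cyning", "stan", "word"}
ENDINGS = ["as", "es", "um", "e", "a", ""]

def lemmatise(form):
    """Return the lemma reached by stripping one ending, or None if unmatched."""
    # Try the longest endings first so that "-as" wins over "-a".
    for ending in sorted(ENDINGS, key=len, reverse=True):
        stem = form[: len(form) - len(ending)] if ending else form
        if form.endswith(ending) and stem in LEMMA_INDEX:
            return stem
    return None

forms = ["cyningas", "stanum", "wordes", "word", "hlaford"]
assigned = {form: lemmatise(form) for form in forms}
# Share of forms that received a lemma automatically.
coverage = sum(lemma is not None for lemma in assigned.values()) / len(forms)
```

Forms whose stem is missing from the index come back as `None`; these are the cases that would still require manual lemmatisation.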

Keywords: corpus linguistics, historical linguistics, Old English, parallel corpus

Procedia PDF Downloads 166

5589 Extracting Actions with Improved Part of Speech Tagging for Social Networking Texts

Authors: Yassine Jamoussi, Ameni Youssfi, Henda Ben Ghezala

Abstract:

With the growing interest in social networking, the interaction of social actors has evolved into a source of knowledge on which context-aware reasoning can be performed. Information extraction from social networks, especially Twitter and Facebook, is one of the open problems in this area. To extract text from social networks, we need several lexical features and large-scale word clustering. We attempt to extend an existing tokenizer and to develop our own tagger in order to handle the non-standard words found on Facebook and Twitter. Our goal in this work is to benefit from the lexical features developed for Twitter and online conversational text in previous works, and to develop an extraction model for constructing a large knowledge base centered on actions.

Keywords: social networking, information extraction, part-of-speech tagging, natural language processing

Procedia PDF Downloads 270

5588 Tagging a Corpus of Media Interviews with Diplomats: Challenges and Solutions

Authors: Roberta Facchinetti, Sara Corrizzato, Silvia Cavalieri

Abstract:

Increasing interconnection between data digitalization and linguistic investigation has given rise to unprecedented potentialities and challenges for corpus linguists, who need to master IT tools for data analysis and text processing, as well as to develop techniques for efficient and reliable annotation in specific mark-up languages that encode documents in a format that is both human- and machine-readable. In the present paper, the challenges emerging from the compilation of a linguistic corpus will be taken into consideration, focusing on the English language in particular. To do so, the case study of the InterDiplo corpus will be illustrated. The corpus, currently under development at the University of Verona (Italy), represents a novelty in terms both of the data included and of the tag set used for its annotation. The corpus covers media interviews and debates with diplomats and international operators conversing in English with journalists who do not share the same lingua-cultural background as their interviewees. To date, this appears to be the first tagged corpus of international institutional spoken discourse and will be an important database not only for linguists interested in corpus analysis but also for experts operating in international relations. In the present paper, special attention will be dedicated to the structural mark-up, parts of speech annotation, and tagging of discursive traits, which are the innovative parts of the project and result from a thorough study to find the solution best suited to the analytical needs of the data. Several aspects will be addressed, with special attention to the tagging of the speakers’ identity, the communicative events, and anthropophagic. Prominence will be given to the annotation of question/answer exchanges to investigate the interlocutors’ choices and how such choices impact communication. Indeed, the automated identification of questions, in relation to the expected answers, helps to understand how interviewers elicit information as well as how interviewees provide their answers to fulfill their respective communicative aims. A detailed description of the aforementioned elements will be given using the InterDiplo-Covid19 pilot corpus. The results of our preliminary analysis will highlight the viable solutions found in the construction of the corpus in terms of XML conversion, metadata definition, tagging system, and discursive-pragmatic annotation to be included via Oxygen.

Keywords: spoken corpus, diplomats’ interviews, tagging system, discursive-pragmatic annotation, English linguistics

Procedia PDF Downloads 151

5587 Co-Evolutionary Fruit Fly Optimization Algorithm and Firefly Algorithm for Solving Unconstrained Optimization Problems

Authors: R. M. Rizk-Allah

Abstract:

This paper presents a co-evolutionary fruit fly optimization algorithm based on the firefly algorithm (CFOA-FA) for solving unconstrained optimization problems. The proposed algorithm integrates the merits of the fruit fly optimization algorithm (FOA), the firefly algorithm (FA), and an elite strategy to refine the performance of the classical FOA. Moreover, a co-evolutionary mechanism is implemented by applying FA procedures to ensure the diversity of the swarm. Finally, the proposed CFOA-FA algorithm is tested on several benchmark problems from the literature, and the numerical results demonstrate its superiority in finding the global optimal solution.
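
Illustrative sketch (the textbook firefly update only, not CFOA-FA): in the standard firefly algorithm, a firefly i moves toward each brighter firefly j by x_i ← x_i + β0·exp(-γ·r²)·(x_j - x_i) + α·(rand - 0.5), where r is the distance between them. The objective `sphere`, all parameter values, and the best-so-far bookkeeping are assumptions for illustration.

```python
import math
import random

def sphere(x):
    """Toy objective to minimise (an assumption for illustration)."""
    return sum(v * v for v in x)

def firefly_minimise(objective, dim, n=15, iters=50, lo=-5.0, hi=5.0,
                     beta0=1.0, gamma=1.0, alpha=0.2, seed=0):
    rng = random.Random(seed)
    swarm = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n)]
    best = min(swarm, key=objective)[:]
    initial_best = objective(best)
    for _ in range(iters):
        for i in range(n):
            for j in range(n):
                # Firefly i moves toward any brighter (lower-cost) firefly j.
                if objective(swarm[j]) < objective(swarm[i]):
                    r2 = sum((a - b) ** 2 for a, b in zip(swarm[i], swarm[j]))
                    beta = beta0 * math.exp(-gamma * r2)
                    swarm[i] = [
                        min(hi, max(lo, a + beta * (b - a)
                                    + alpha * (rng.random() - 0.5)))
                        for a, b in zip(swarm[i], swarm[j])
                    ]
            # Remember the best position seen so far.
            if objective(swarm[i]) < objective(best):
                best = swarm[i][:]
    return best, initial_best

best, initial_best = firefly_minimise(sphere, dim=2)
```

The random term keeps the swarm from collapsing onto a single point, which is the diversity-preserving role the abstract attributes to the FA procedures.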

Keywords: firefly algorithm, fruit fly optimization algorithm, unconstrained optimization problems

Procedia PDF Downloads 498

5586 Graph Cuts Segmentation Approach Using a Patch-Based Similarity Measure Applied for Interactive CT Lung Image Segmentation

Authors: Aicha Majda, Abdelhamid El Hassani

Abstract:

Lung CT image segmentation is a prerequisite for lung CT image analysis. Most conventional methods need post-processing to deal with abnormal lung CT scans, such as those containing lung nodules or other lesions. The simplest similarity measure in the standard graph cuts algorithm directly compares the pixel values of two neighboring regions, which is not accurate because this kind of metric is extremely sensitive to minor perturbations such as noise or other artifacts. In this work, we propose an improved version of the standard graph cuts algorithm based on a patch-based similarity metric. The boundary penalty term in the graph cuts algorithm is defined using a patch-based similarity measurement instead of the simple intensity measurement of the standard method. The weights between each pixel and its neighboring pixels are based on the resulting term. The graph is then created using these weights between its nodes. Finally, the segmentation is completed with the min-cut/max-flow algorithm. Experimental results show that the proposed method is very accurate and efficient and, unlike the standard method, directly provides explicit lung regions without any post-processing operations.
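
Illustrative sketch (the exact boundary term used by the authors is not given in the abstract): a common pixel-based edge weight is exp(-(I_p - I_q)² / (2σ²)); one patch-based variant, assumed here, replaces the squared intensity difference with the mean squared difference between the patches around p and q. The image, σ, the patch radius, and the function names are assumptions.

```python
import math

def patch(image, r, c, radius=1):
    """Collect the (2*radius+1)^2 patch around (r, c), clamping at the borders."""
    rows, cols = len(image), len(image[0])
    return [
        image[min(rows - 1, max(0, r + dr))][min(cols - 1, max(0, c + dc))]
        for dr in range(-radius, radius + 1)
        for dc in range(-radius, radius + 1)
    ]

def pixel_weight(image, p, q, sigma=10.0):
    """Edge weight from the intensity difference of two pixels."""
    diff = image[p[0]][p[1]] - image[q[0]][q[1]]
    return math.exp(-(diff * diff) / (2 * sigma * sigma))

def patch_weight(image, p, q, sigma=10.0, radius=1):
    """Edge weight from the mean squared difference of two patches."""
    a, b = patch(image, *p, radius), patch(image, *q, radius)
    msd = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    return math.exp(-msd / (2 * sigma * sigma))

# Uniform 5x5 region with a single noisy pixel at (2, 2).
image = [[100.0] * 5 for _ in range(5)]
image[2][2] = 130.0
w_pixel = pixel_weight(image, (2, 2), (2, 3))
w_patch = patch_weight(image, (2, 2), (2, 3))
```

On this toy image the single noisy pixel lowers the pixel-based weight far more than the patch-based one, which illustrates the noise sensitivity the abstract describes.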

Keywords: graph cuts, lung CT scan, lung parenchyma segmentation, patch-based similarity metric

Procedia PDF Downloads 136

5585 The Selection of the Nearest Anchor Using Received Signal Strength Indication (RSSI)

Authors: Hichem Sassi, Tawfik Najeh, Noureddine Liouane

Abstract:

Localization information is crucial for the operation of wireless sensor networks (WSNs). There are principally two types of localization algorithms. Range-based localization algorithms place strict requirements on hardware and are therefore expensive to implement in practice. Range-free localization algorithms reduce the hardware cost but achieve high accuracy only in ideal scenarios. In this paper, we locate unknown nodes by combining the advantages of these two types of methods. The proposed algorithm makes each unknown node select the nearest anchor using the Received Signal Strength Indicator (RSSI) and choose the two other most accurate anchors to compute the estimated location. Our algorithm improves the localization accuracy compared with previous algorithms, as demonstrated by the simulation results.
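
Illustrative sketch (only the nearest-anchor step, under an assumed propagation model): with the log-distance path loss model RSSI(d) = P0 - 10·n·log10(d/d0), the anchor with the strongest RSSI is the nearest one, and its distance can be estimated by inverting the model. The reference power P0 = -40 dBm at d0 = 1 m, the exponent n = 2, and the anchor readings are hypothetical values.

```python
def rssi_to_distance(rssi_dbm, p0_dbm=-40.0, n=2.0, d0=1.0):
    """Invert the log-distance path loss model to estimate distance in metres."""
    return d0 * 10 ** ((p0_dbm - rssi_dbm) / (10.0 * n))

# Hypothetical RSSI readings (dBm) heard by one unknown node.
readings = {"A1": -62.0, "A2": -55.0, "A3": -71.0}

# The strongest signal (largest RSSI) identifies the nearest anchor.
nearest = max(readings, key=readings.get)
distance = rssi_to_distance(readings[nearest])
```

In practice P0 and n must be calibrated for the deployment environment, since RSSI fluctuates with multipath and obstacles.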

Keywords: WSN, localization, DV-Hop, RSSI

Procedia PDF Downloads 331