Search results for: document score
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 596

Search results for: document score

566 The Keys to Innovation: Defining and Evaluating Attributes that Measure Innovation Capabilities

Authors: Mohammad Samarah, Benjamin Stark, Jennifer Kindle, Langley Payton

Abstract:

Innovation is a key driver for companies, society, and economic growth. However, assessing and measuring innovation for individuals as well as organizations remains difficult. Our i5-Score presented in this study will help to overcome this difficulty and facilitate measuring the innovation potential. The score is based on a framework we call the 5Gs of innovation which defines specific innovation attributes. Those are 1) the drive for long-term goals 2) the audacity to generate new ideas, 3) the openness to share ideas with others, 4) the ability to grow, and 5) the ability to maintain high levels of optimism. To validate the i5-Score, we conducted a study at Florida Polytechnic University. The results show that the i5-Score is a good measure reflecting the innovative mindset of an individual or a group. Thus, the score can be utilized for evaluating, refining and enhancing innovation capabilities.

Keywords: Change management, innovation attributes, organizational development, STEM and venture creation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1024
565 A Methodology for Automatic Diversification of Document Categories

Authors: Dasom Kim, Chen Liu, Myungsu Lim, Soo-Hyeon Jeon, Byeoung Kug Jeon, Kee-Young Kwahk, Namgyu Kim

Abstract:

Recently, numerous documents including large volumes of unstructured data and text have been created because of the rapid increase in the use of social media and the Internet. Usually, these documents are categorized for the convenience of users. Because the accuracy of manual categorization is not guaranteed, and such categorization requires a large amount of time and incurs huge costs. Many studies on automatic categorization have been conducted to help mitigate the limitations of manual categorization. Unfortunately, most of these methods cannot be applied to categorize complex documents with multiple topics because they work on the assumption that individual documents can be categorized into single categories only. Therefore, to overcome this limitation, some studies have attempted to categorize each document into multiple categories. However, the learning process employed in these studies involves training using a multi-categorized document set. These methods therefore cannot be applied to the multi-categorization of most documents unless multi-categorized training sets using traditional multi-categorization algorithms are provided. To overcome this limitation, in this study, we review our novel methodology for extending the category of a single-categorized document to multiple categorizes, and then introduce a survey-based verification scenario for estimating the accuracy of our automatic categorization methodology.

Keywords: Big Data Analysis, Document Classification, Text Mining, Topic Analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1745
564 Restoration of Noisy Document Images with an Efficient Bi-Level Adaptive Thresholding

Authors: Abhijit Mitra

Abstract:

An effective approach for extracting document images from a noisy background is introduced. The entire scheme is divided into three sub- stechniques – the initial preprocessing operations for noise cluster tightening, introduction of a new thresholding method by maximizing the ratio of stan- dard deviations of the combined effect on the image to the sum of weighted classes and finally the image restoration phase by image binarization utiliz- ing the proposed optimum threshold level. The proposed method is found to be efficient compared to the existing schemes in terms of computational complexity as well as speed with better noise rejection.

Keywords: Document image extraction, Preprocessing, Ratio of stan-dard deviations, Bi-level adaptive thresholding.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1457
563 Fast Document Segmentation Using Contourand X-Y Cut Technique

Authors: Boontee Kruatrachue, Narongchai Moongfangklang, Kritawan Siriboon

Abstract:

This paper describes fast and efficient method for page segmentation of document containing nonrectangular block. The segmentation is based on edge following algorithm using small window of 16 by 32 pixels. This segmentation is very fast since only border pixels of paragraph are used without scanning the whole page. Still, the segmentation may contain error if the space between them is smaller than the window used in edge following. Consequently, this paper reduce this error by first identify the missed segmentation point using direction information in edge following then, using X-Y cut at the missed segmentation point to separate the connected columns. The advantage of the proposed method is the fast identification of missed segmentation point. This methodology is faster with fewer overheads than other algorithms that need to access much more pixel of a document.

Keywords: Contour Direction Technique, Missed SegmentationPoints, Page Segmentation, Recursive X-Y Cut Technique

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2783
562 Addressing the Oracle Problem: Decentralized Authentication in Blockchain-Based Green Hydrogen Certification

Authors: Volker Wannack

Abstract:

The aim of this paper is to present a concept for addressing the Oracle Problem in the context of hydrogen production using renewable energy sources. The proposed approach relies on the authentication of the electricity used for hydrogen production by multiple surrounding actors with similar electricity generation facilities, which attest to the authenticity of the electricity production. The concept introduces an Authenticity Score assigned to each certificate, as well as a Trust Score assigned to each witness. Each certificate must be attested by different actors with a sufficient Trust Score to achieve an Authenticity Score above a predefined threshold, thereby demonstrating that the produced hydrogen is indeed "green."

Keywords: Hydrogen, blockchain, sustainability, structural change.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 72
561 Web Search Engine Based Naming Procedure for Independent Topic

Authors: Takahiro Nishigaki, Takashi Onoda

Abstract:

In recent years, the number of document data has been increasing since the spread of the Internet. Many methods have been studied for extracting topics from large document data. We proposed Independent Topic Analysis (ITA) to extract topics independent of each other from large document data such as newspaper data. ITA is a method for extracting the independent topics from the document data by using the Independent Component Analysis. The topic represented by ITA is represented by a set of words. However, the set of words is quite different from the topics the user imagines. For example, the top five words with high independence of a topic are as follows. Topic1 = {"scor", "game", "lead", "quarter", "rebound"}. This Topic 1 is considered to represent the topic of "SPORTS". This topic name "SPORTS" has to be attached by the user. ITA cannot name topics. Therefore, in this research, we propose a method to obtain topics easy for people to understand by using the web search engine, topics given by the set of words given by independent topic analysis. In particular, we search a set of topical words, and the title of the homepage of the search result is taken as the topic name. And we also use the proposed method for some data and verify its effectiveness.

Keywords: Independent topic analysis, topic extraction, topic naming, web search engine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 500
560 Using Simulation Modeling Approach to Predict USMLE Steps 1 and 2 Performances

Authors: Chau-Kuang Chen, John Hughes, Jr., A. Dexter Samuels

Abstract:

The prediction models for the United States Medical Licensure Examination (USMLE) Steps 1 and 2 performances were constructed by the Monte Carlo simulation modeling approach via linear regression. The purpose of this study was to build robust simulation models to accurately identify the most important predictors and yield the valid range estimations of the Steps 1 and 2 scores. The application of simulation modeling approach was deemed an effective way in predicting student performances on licensure examinations. Also, sensitivity analysis (a/k/a what-if analysis) in the simulation models was used to predict the magnitudes of Steps 1 and 2 affected by changes in the National Board of Medical Examiners (NBME) Basic Science Subject Board scores. In addition, the study results indicated that the Medical College Admission Test (MCAT) Verbal Reasoning score and Step 1 score were significant predictors of the Step 2 performance. Hence, institutions could screen qualified student applicants for interviews and document the effectiveness of basic science education program based on the simulation results.

Keywords: Prediction Model, Sensitivity Analysis, Simulation Method, USMLE.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1461
559 Key Based Text Watermarking of E-Text Documents in an Object Based Environment Using Z-Axis for Watermark Embedding

Authors: Mussarat Abdullah, Fazal Wahab

Abstract:

Data hiding into text documents itself involves pretty complexities due to the nature of text documents. A robust text watermarking scheme targeting an object based environment is presented in this research. The heart of the proposed solution describes the concept of watermarking an object based text document where each and every text string is entertained as a separate object having its own set of properties. Taking advantage of the z-ordering of objects watermark is applied with the z-axis letting zero fidelity disturbances to the text. Watermark sequence of bits generated against user key is hashed with selected properties of given document, to determine the bit sequence to embed. Bits are embedded along z-axis and the document has no fidelity issues when printed, scanned or photocopied.

Keywords: Digital Watermarking, Object Based Environment, Watermark, z-ordering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1687
558 Automatic Music Score Recognition System Using Digital Image Processing

Authors: Yuan-Hsiang Chang, Zhong-Xian Peng, Li-Der Jeng

Abstract:

Music has always been an integral part of human’s daily lives. But, for the most people, reading musical score and turning it into melody is not easy. This study aims to develop an Automatic music score recognition system using digital image processing, which can be used to read and analyze musical score images automatically. The technical approaches included: (1) staff region segmentation; (2) image preprocessing; (3) note recognition; and (4) accidental and rest recognition. Digital image processing techniques (e.g., horizontal /vertical projections, connected component labeling, morphological processing, template matching, etc.) were applied according to musical notes, accidents, and rests in staff notations. Preliminary results showed that our system could achieve detection and recognition rates of 96.3% and 91.7%, respectively. In conclusion, we presented an effective automated musical score recognition system that could be integrated in a system with a media player to play music/songs given input images of musical score. Ultimately, this system could also be incorporated in applications for mobile devices as a learning tool, such that a music player could learn to play music/songs.

Keywords: Connected component labeling, image processing, morphological processing, optical musical recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1931
557 Applying Clustering of Hierarchical K-means-like Algorithm on Arabic Language

Authors: Sameh H. Ghwanmeh

Abstract:

In this study a clustering technique has been implemented which is K-Means like with hierarchical initial set (HKM). The goal of this study is to prove that clustering document sets do enhancement precision on information retrieval systems, since it was proved by Bellot & El-Beze on French language. A comparison is made between the traditional information retrieval system and the clustered one. Also the effect of increasing number of clusters on precision is studied. The indexing technique is Term Frequency * Inverse Document Frequency (TF * IDF). It has been found that the effect of Hierarchical K-Means Like clustering (HKM) with 3 clusters over 242 Arabic abstract documents from the Saudi Arabian National Computer Conference has significant results compared with traditional information retrieval system without clustering. Additionally it has been found that it is not necessary to increase the number of clusters to improve precision more.

Keywords: Hierarchical K-mean like clustering (HKM), Kmeans, cluster centroids, initial partition, and document distances

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2572
556 Evolutionary Query Optimization for Heterogeneous Distributed Database Systems

Authors: Reza Ghaemi, Amin Milani Fard, Hamid Tabatabaee, Mahdi Sadeghizadeh

Abstract:

Due to new distributed database applications such as huge deductive database systems, the search complexity is constantly increasing and we need better algorithms to speedup traditional relational database queries. An optimal dynamic programming method for such high dimensional queries has the big disadvantage of its exponential order and thus we are interested in semi-optimal but faster approaches. In this work we present a multi-agent based mechanism to meet this demand and also compare the result with some commonly used query optimization algorithms.

Keywords: Information retrieval systems, list fusion methods, document score, multi-agent systems.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3424
555 One-Class Support Vector Machine for Sentiment Analysis of Movie Review Documents

Authors: Chothmal, Basant Agarwal

Abstract:

Sentiment analysis means to classify a given review document into positive or negative polar document. Sentiment analysis research has been increased tremendously in recent times due to its large number of applications in the industry and academia. Sentiment analysis models can be used to determine the opinion of the user towards any entity or product. E-commerce companies can use sentiment analysis model to improve their products on the basis of users’ opinion. In this paper, we propose a new One-class Support Vector Machine (One-class SVM) based sentiment analysis model for movie review documents. In the proposed approach, we initially extract features from one class of documents, and further test the given documents with the one-class SVM model if a given new test document lies in the model or it is an outlier. Experimental results show the effectiveness of the proposed sentiment analysis model.

Keywords: Feature selection methods, Machine learning, NB, One-class SVM, Sentiment Analysis, Support Vector Machine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3303
554 Stock Market Prediction by Regression Model with Social Moods

Authors: Masahiro Ohmura, Koh Kakusho, Takeshi Okadome

Abstract:

This paper presents a regression model with autocorrelated errors in which the inputs are social moods obtained by analyzing the adjectives in Twitter posts using a document topic model, where document topics are extracted using LDA. The regression model predicts Dow Jones Industrial Average (DJIA) more precisely than autoregressive moving-average models.

Keywords: Regression model, social mood, stock market prediction, Twitter.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2434
553 An Agent Oriented Architecture to Supply Dynamic Document Generation in ERP Systems

Authors: Hassan Haghighi, Seyedeh Zahra Hosseini, Seyedeh Elahe Jalambadani

Abstract:

One of the most important aspects expected from an ERP system is to mange user\administrator manual documents dynamically. Since an ERP package is frequently changed during its implementation in customer sites, it is often needed to add new documents and/or apply required changes to existing documents in order to cover new or changed capabilities. The worse is that since these changes occur continuously, the corresponding documents should be updated dynamically; otherwise, implementing the ERP package in the organization encounters serious risks. In this paper, we propose a new architecture which is based on the agent oriented vision and supplies the dynamic document generation expected from ERP systems using several independent but cooperative agents. Beside the dynamic document generation which is the main issue of this paper, the presented architecture will address some aspects of intelligence and learning capabilities existing in ERP.

Keywords: enterprise resource planning, dynamic documentgeneration, software architecture, agent oriented architecture, learning, intelligence

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1656
552 Graph-Based Text Similarity Measurement by Exploiting Wikipedia as Background Knowledge

Authors: Lu Zhang, Chunping Li, Jun Liu, Hui Wang

Abstract:

Text similarity measurement is a fundamental issue in many textual applications such as document clustering, classification, summarization and question answering. However, prevailing approaches based on Vector Space Model (VSM) more or less suffer from the limitation of Bag of Words (BOW), which ignores the semantic relationship among words. Enriching document representation with background knowledge from Wikipedia is proven to be an effective way to solve this problem, but most existing methods still cannot avoid similar flaws of BOW in a new vector space. In this paper, we propose a novel text similarity measurement which goes beyond VSM and can find semantic affinity between documents. Specifically, it is a unified graph model that exploits Wikipedia as background knowledge and synthesizes both document representation and similarity computation. The experimental results on two different datasets show that our approach significantly improves VSM-based methods in both text clustering and classification.

Keywords: Text classification, Text clustering, Text similarity, Wikipedia

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2117
551 EEG Analysis of Brain Dynamics in Children with Language Disorders

Authors: Hamed Alizadeh Dashagholi, Hossein Yousefi-Banaem, Mina Naeimi

Abstract:

Current study established for EEG signal analysis in patients with language disorder. Language disorder can be defined as meaningful delay in the use or understanding of spoken or written language. The disorder can include the content or meaning of language, its form, or its use. Here we applied Z-score, power spectrum, and coherence methods to discriminate the language disorder data from healthy ones. Power spectrum of each channel in alpha, beta, gamma, delta, and theta frequency bands was measured. In addition, intra hemispheric Z-score obtained by scoring algorithm. Obtained results showed high Z-score and power spectrum in posterior regions. Therefore, we can conclude that peoples with language disorder have high brain activity in frontal region of brain in comparison with healthy peoples. Results showed that high coherence correlates with irregularities in the ERP and is often found during complex task, whereas low coherence is often found in pathological conditions. The results of the Z-score analysis of the brain dynamics showed higher Z-score peak frequency in delta, theta and beta sub bands of Language Disorder patients. In this analysis there were activity signs in both hemispheres and the left-dominant hemisphere was more active than the right.

Keywords: EEG, electroencephalography, coherence methods, language disorder, power spectrum, z-score.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2550
550 Estimation of Skew Angle in Binary Document Images Using Hough Transform

Authors: Nandini N., Srikanta Murthy K., G. Hemantha Kumar

Abstract:

This paper includes two novel techniques for skew estimation of binary document images. These algorithms are based on connected component analysis and Hough transform. Both these methods focus on reducing the amount of input data provided to Hough transform. In the first method, referred as word centroid approach, the centroids of selected words are used for skew detection. In the second method, referred as dilate & thin approach, the selected characters are blocked and dilated to get word blocks and later thinning is applied. The final image fed to Hough transform has the thinned coordinates of word blocks in the image. The methods have been successful in reducing the computational complexity of Hough transform based skew estimation algorithms. Promising experimental results are also provided to prove the effectiveness of the proposed methods.

Keywords: Dilation, Document processing, Hough transform, Optical Character Recognition, Skew estimation, and Thinning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3266
549 The Amino-Acid Score and Physical Growth: Implications for the Assessment of Protein Quality

Authors: P. Grasgruber, J. Cacek, S. Hřebíčková

Abstract:

The purpose of this study was to test the reliability of various standards that assess the quality of proteins via the “amino-acid score” and serve as a nutritional guideline for both children and adults. The height of young men in 42 European countries, Australia, New Zealand and USA was compared with the average consumption of food (after FAOSTAT, 2009) and a subsequent statistical analysis identified types of food with the most pronounced effect on physical growth. The results show that milk products and pork meat are by far the most significant nutritional factors in this regard. Cereals, vegetables and especially wheat played a strongly negative role. The results generally agreed best with the amino-acid score of proteins according to the standard of FAO 1985. In our opinion, the new standard of FAO 2007 underestimates the importance of tryptophan, which should provoke a debate about new modifications of the FAO guidelines.

Keywords: Protein quality, amino-acid score, physical growth, male height.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3784
548 Schema and Data Migration of a Relational Database RDB to the Extensible Markup Language XML

Authors: Alae El Alami, Mohamed Bahaj

Abstract:

This article discusses the passage of RDB to XML documents (schema and data) based on metadata and semantic enrichment, which makes the RDB under flattened shape and is enriched by the object concept. The integration and exploitation of the object concept in the XML uses a syntax allowing for the verification of the conformity of the document XML during the creation. The information extracted from the RDB is therefore analyzed and filtered in order to adjust according to the structure of the XML files and the associated object model. Those implemented in the XML document through a SQL query are built dynamically. A prototype was implemented to realize automatic migration, and so proves the effectiveness of this particular approach.

Keywords: RDB, XML, DTD, semantic enrichment.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1824
547 Bottom Up Text Mining through Hierarchical Document Representation

Authors: Y. Djouadi., F. Souam.

Abstract:

Most of the existing text mining approaches are proposed, keeping in mind, transaction databases model. Thus, the mined dataset is structured using just one concept: the “transaction", whereas the whole dataset is modeled using the “set" abstract type. In such cases, the structure of the whole dataset and the relationships among the transactions themselves are not modeled and consequently, not considered in the mining process. We believe that taking into account structure properties of hierarchically structured information (e.g. textual document, etc ...) in the mining process, can leads to best results. For this purpose, an hierarchical associations rule mining approach for textual documents is proposed in this paper and the classical set-oriented mining approach is reconsidered profits to a Direct Acyclic Graph (DAG) oriented approach. Natural languages processing techniques are used in order to obtain the DAG structure. Based on this graph model, an hierarchical bottom up algorithm is proposed. The main idea is that each node is mined with its parent node.

Keywords: Graph based association rules mining, Hierarchical document structure, Text mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2057
546 Implementation of RSA Blind Signature on CryptO-0N2 Protocol

Authors: Esti Rahmawati Agustina, Is Esti Firmanesa

Abstract:

Blind Signature were introduced by Chaum. In this scheme, a signer can “sign” a document without knowing the document contain. This is particularly important in electronic voting. CryptO-0N2 is an electronic voting protocol which is development of CryptO-0N. During its development this protocol has not been furnished with the requirement of blind signature, so the choice of voters can be determined by counting center. In this paper will be presented of implementation of blind signature using RSA algorithm.

Keywords: Blind signature, electronic voting protocol, RSA algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3192
545 Identifying Missing Component in the Bechdel Test Using Principal Component Analysis Method

Authors: Raghav Lakhotia, Chandra Kanth Nagesh, Krishna Madgula

Abstract:

A lot has been said and discussed regarding the rationale and significance of the Bechdel Score. It became a digital sensation in 2013, when Swedish cinemas began to showcase the Bechdel test score of a film alongside its rating. The test has drawn criticism from experts and the film fraternity regarding its use to rate the female presence in a movie. The pundits believe that the score is too simplified and the underlying criteria of a film to pass the test must include 1) at least two women, 2) who have at least one dialogue, 3) about something other than a man, is egregious. In this research, we have considered a few more parameters which highlight how we represent females in film, like the number of female dialogues in a movie, dialogue genre, and part of speech tags in the dialogue. The parameters were missing in the existing criteria to calculate the Bechdel score. The research aims to analyze 342 movies scripts to test a hypothesis if these extra parameters, above with the current Bechdel criteria, are significant in calculating the female representation score. The result of the Principal Component Analysis method concludes that the female dialogue content is a key component and should be considered while measuring the representation of women in a work of fiction.

Keywords: Bechdel test, dialogue genre, parts of speech tags, principal component analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 798
544 Application of Generalized Autoregressive Score Model to Stock Returns

Authors: Katleho Daniel Makatjane, Diteboho Lawrence Xaba, Ntebogang Dinah Moroke

Abstract:

The current study investigates the behaviour of time-varying parameters that are based on the score function of the predictive model density at time t. The mechanism to update the parameters over time is the scaled score of the likelihood function. The results revealed that there is high persistence of time-varying, as the location parameter is higher and the skewness parameter implied the departure of scale parameter from the normality with the unconditional parameter as 1.5. The results also revealed that there is a perseverance of the leptokurtic behaviour in stock returns which implies the returns are heavily tailed. Prior to model estimation, the White Neural Network test exposed that the stock price can be modelled by a GAS model. Finally, we proposed further researches specifically to model the existence of time-varying parameters with a more detailed model that encounters the heavy tail distribution of the series and computes the risk measure associated with the returns.

Keywords: Generalized autoregressive score model, stock returns, time-varying.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1034
543 A Hybrid Gene Selection Technique Using Improved Mutual Information and Fisher Score for Cancer Classification Using Microarrays

Authors: M. Anidha, K. Premalatha

Abstract:

Feature Selection is significant in order to perform constructive classification in the area of cancer diagnosis. However, a large number of features compared to the number of samples makes the task of classification computationally very hard and prone to errors in microarray gene expression datasets. In this paper, we present an innovative method for selecting highly informative gene subsets of gene expression data that effectively classifies the cancer data into tumorous and non-tumorous. The hybrid gene selection technique comprises of combined Mutual Information and Fisher score to select informative genes. The gene selection is validated by classification using Support Vector Machine (SVM) which is a supervised learning algorithm capable of solving complex classification problems. The results obtained from improved Mutual Information and F-Score with SVM as a classifier has produced efficient results.

Keywords: Gene selection, mutual information, Fisher score, classification, SVM.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1152
542 Evaluating Health-Related Quality of Life of Lost to Follow-Up Tuberculosis Patients in Yemen

Authors: Ammar Ali Saleh Jaber, Amer Hayat Khan, Syed Azhar Syed Sulaiman

Abstract:

Tuberculosis (TB) is considered as a major disease that affects daily activities and impairs health-related quality of life (HRQoL). The impact of TB on HRQoL can affect treatment outcome and may lead to treatment defaulting. Therefore, this study aims to evaluate the HRQoL of TB treatment lost to follow-up during and after treatment in Yemen. For this aim, this prospective study enrolled a total of 399 TB lost to follow-up patients between January 2011 and December 2015. By applying HRQoL criteria, only 136 fill the survey during treatment. Moreover, 96 were traced and fill out the HRQoL survey. All eight HRQol domains were categorized into the physical component score (PCS) and mental component score (MCS), which were calculated using QM scoring software. Results show that all lost to follow-up TB patients reported a score less than 47 for all eight domains, except general health (67.3) during their treatment period. Low scores of 27.9 and 29.8 were reported for emotional role limitation (RE) and mental health (MH), respectively. Moreover, the mental component score (MCS) was found to be only 28.9. The trace lost follow-up shows a significant improvement in all eight domains and a mental component score of 43.1. The low scores of 27.9 and 29.8 for role emotion and mental health, respectively, in addition to the MCS score of 28.9, show that severe emotional condition and reflect the higher depression during treatment period that can result to lost to follow-up. The low MH, RE, and MCS can be used as a clue for predicting future TB treatment lost to follow-up.

Keywords: Yemen, tuberculosis, health-related quality of life, khat.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 886
541 Ontology-based Concept Weighting for Text Documents

Authors: Hmway Hmway Tar, Thi Thi Soe Nyaunt

Abstract:

Documents clustering become an essential technology with the popularity of the Internet. That also means that fast and high-quality document clustering technique play core topics. Text clustering or shortly clustering is about discovering semantically related groups in an unstructured collection of documents. Clustering has been very popular for a long time because it provides unique ways of digesting and generalizing large amounts of information. One of the issues of clustering is to extract proper feature (concept) of a problem domain. The existing clustering technology mainly focuses on term weight calculation. To achieve more accurate document clustering, more informative features including concept weight are important. Feature Selection is important for clustering process because some of the irrelevant or redundant feature may misguide the clustering results. To counteract this issue, the proposed system presents the concept weight for text clustering system developed based on a k-means algorithm in accordance with the principles of ontology so that the important of words of a cluster can be identified by the weight values. To a certain extent, it has resolved the semantic problem in specific areas.

Keywords: Clustering, Concept Weight, Document clustering, Feature Selection, Ontology

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2405
540 Automatic Building an Extensive Arabic FA Terms Dictionary

Authors: El-Sayed Atlam, Masao Fuketa, Kazuhiro Morita, Jun-ichi Aoe

Abstract:

Field Association (FA) terms are a limited set of discriminating terms that give us the knowledge to identify document fields which are effective in document classification, similar file retrieval and passage retrieval. But the problem lies in the lack of an effective method to extract automatically relevant Arabic FA Terms to build a comprehensive dictionary. Moreover, all previous studies are based on FA terms in English and Japanese, and the extension of FA terms to other language such Arabic could be definitely strengthen further researches. This paper presents a new method to extract, Arabic FA Terms from domain-specific corpora using part-of-speech (POS) pattern rules and corpora comparison. Experimental evaluation is carried out for 14 different fields using 251 MB of domain-specific corpora obtained from Arabic Wikipedia dumps and Alhyah news selected average of 2,825 FA Terms (single and compound) per field. From the experimental results, recall and precision are 84% and 79% respectively. Therefore, this method selects higher number of relevant Arabic FA Terms at high precision and recall.

Keywords: Arabic Field Association Terms, information extraction, document classification, information retrieval.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1734
539 Improved Weighted Matching for Speaker Recognition

Authors: Ozan Mut, Mehmet Göktürk

Abstract:

Matching algorithms have significant importance in speaker recognition. Feature vectors of the unknown utterance are compared to feature vectors of the modeled speakers as a last step in speaker recognition. A similarity score is found for every model in the speaker database. Depending on the type of speaker recognition, these scores are used to determine the author of unknown speech samples. For speaker verification, similarity score is tested against a predefined threshold and either acceptance or rejection result is obtained. In the case of speaker identification, the result depends on whether the identification is open set or closed set. In closed set identification, the model that yields the best similarity score is accepted. In open set identification, the best score is tested against a threshold, so there is one more possible output satisfying the condition that the speaker is not one of the registered speakers in existing database. This paper focuses on closed set speaker identification using a modified version of a well known matching algorithm. The results of new matching algorithm indicated better performance on YOHO international speaker recognition database.

Keywords: Automatic Speaker Recognition, Voice Recognition, Pattern Recognition, Digital Audio Signal Processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1732
538 Graphic Watermarking, Security Feature in Cadastral Content Management

Authors: Manole Velicanu, Emanuil Rednic

Abstract:

The paper shows the necessity to increase the security level for paper management in the cadastral field by using specific graphical watermarks. Using the graphical watermarking will increase the security in the cadastral content management; furthermore any altered document will be validated afterwards of its originality by checking the graphic watermark. If, by any reasons the document is changed for counterfeiting, it is invalidated and found that is an illegal copy due to the graphic check of the watermarking, check made at pixel level

Keywords: cadastral system, database security, security standards, content management, identity management, watermarking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1526
537 Automatic Detection of Syllable Repetition in Read Speech for Objective Assessment of Stuttered Disfluencies

Authors: K. M. Ravikumar, Balakrishna Reddy, R. Rajagopal, H. C. Nagaraj

Abstract:

Automatic detection of syllable repetition is one of the important parameter in assessing the stuttered speech objectively. The existing method which uses artificial neural network (ANN) requires high levels of agreement as prerequisite before attempting to train and test ANNs to separate fluent and nonfluent. We propose automatic detection method for syllable repetition in read speech for objective assessment of stuttered disfluencies which uses a novel approach and has four stages comprising of segmentation, feature extraction, score matching and decision logic. Feature extraction is implemented using well know Mel frequency Cepstra coefficient (MFCC). Score matching is done using Dynamic Time Warping (DTW) between the syllables. The Decision logic is implemented by Perceptron based on the score given by score matching. Although many methods are available for segmentation, in this paper it is done manually. Here the assessment by human judges on the read speech of 10 adults who stutter are described using corresponding method and the result was 83%.

Keywords: Assessment, DTW, MFCC, Objective, Perceptron, Stuttering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2811