Search results for: Continuity Care Document
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 803

Search results for: Continuity Care Document

803 Research on Applying the Continuity Care Document to Generate a Medical Record with Entry Level

Authors: Hsing-Yi Kao, Der-Ming Liou

Abstract:

Transferring patient information between medical care sites is necessary to deliver better patient care and to reduce medical cost. So developing of electronic medical records is an important trend for the world.The Continuity of Care Document (CCD) is product of collaboration between CDA and CCR standards. In this study, we will develop a system to generate medical records with entry level based on CCD template module.

Keywords: Continuity Care Document, medical record, entrylevel

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1936
802 Knowledge Continuity as a Part of Business Continuity Management

Authors: H. Urbancova, J. Urbanec

Abstract:

Today the intangible assets are the capital of knowledge and are the most important and the most valuable resource for organizations. All employees have knowledge independently of the kind of jobs they do. Knowledge is thus an asset, which influences business operations. The objective of this article is to identify knowledge continuity as an objective of business continuity management. The article has been prepared based on the analysis of secondary sources and the evaluation of primary sources of data by means of a quantitative survey conducted in the Czech Republic. The conclusion of the article is that organizations that apply business continuity management do not focus on the preservation of the knowledge of key employees. Organizations ensure knowledge continuity only intuitively, on a random basis, non-systematically and discontinuously. The non-ensuring of knowledge continuity represents a threat of loss of key knowledge for organizations and can also negatively affect business continuity.

Keywords: Business continuity, knowledge, organizations, survey.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3471
801 Advantages and Disadvantages of Business Continuity Management

Authors: K. Venclova, H. Urbancova, H. Vostra Vydrova

Abstract:

In current global economics the application of Business Continuity Management is the prerequisite for sustainable competitive advantage in an organization. Business Continuity Management is a managerial which identifies the potential impact of losses in an organization. The aim of this paper is to identify and critically evaluate the relative advantages and disadvantages of deploying Business Continuity Management in an organization on the basis of seven criteria. The strongest advantage of Business Continuity Management is in its capacity to identify a crisis situation and help the organization to flexibly and also to keep the critical knowledge within the organization. By contrast the main disadvantage is that establishing Business Continuity Management in an organization is time-consuming and its implementation as an integral part of the organizational culture present significant difficulties.

Keywords: Business continuity management, criteria, advantages, disadvantages, organisations, survey.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13158
800 Generating Arabic Fonts Using Rational Cubic Ball Functions

Authors: Fakharuddin Ibrahim, Jamaludin Md. Ali, Ahmad Ramli

Abstract:

In this paper, we will discuss about the data interpolation by using the rational cubic Ball curve. To generate a curve with a better and satisfactory smoothness, the curve segments must be connected with a certain amount of continuity. The continuity that we will consider is of type G1 continuity. The conditions considered are known as the G1 Hermite condition. A simple application of the proposed method is to generate an Arabic font satisfying the required continuity.

Keywords: Continuity, data interpolation, Hermite condition, rational Ball curve.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1418
799 Enabling Integration across Heterogeneous Care Networks

Authors: Federico Cabitza, Marco P. Locatelli, Marcello Sarini, Carla Simone

Abstract:

The paper shows how the CASMAS modeling language, and its associated pervasive computing architecture, can be used to facilitate continuity of care by providing members of patientcentered communities of care with a support to cooperation and knowledge sharing through the usage of electronic documents and digital devices. We consider a scenario of clearly fragmented care to show how proper mechanisms can be defined to facilitate a better integration of practices and information across heterogeneous care networks. The scenario is declined in terms of architectural components and cooperation-oriented mechanisms that make the support reactive to the evolution of the context where these communities operate.

Keywords: Pervasive Computing, Communities of Care, HeterogeneousCare Networks, Multi-Agent System.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1316
798 Entropy Based Data Hiding for Document Images

Authors: Swetha Kurup, Sridhar G., Sridhar V.

Abstract:

In this paper we present a novel technique for data hiding in binary document images. We use the concept of entropy in order to identify document specific least distortive areas throughout the binary document image. The document image is treated as any other image and the proposed method utilizes the standard document characteristics for the embedding process. Proposed method minimizes perceptual distortion due to embedding and allows watermark extraction without the requirement of any side information at the decoder end.

Keywords: Entropy, Steganography, Watermarking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1482
797 Usage of Military Continuity Management System for Supporting of Emergency Management

Authors: R. Hajkova, J. Palecek, H. Malachova, A. Oulehlova

Abstract:

Ensuring of continuity of business is basic strategy of every company. Continuity of organization activities includes comprehensive procedures that help in solving unexpected situations of natural and anthropogenic character (for example flood, blaze, economic situations). Planning of continuity operations is a process that helps identify critical processes and implement plans for the security and recovery of key processes. The aim of this article is to demonstrate application of system approach to managing business continuity called business continuity management systems in military issues. This article describes the life cycle of business continuity management which is based on the established cycle PDCA (Plan- Do-Check-Act). After this is carried out by activities which are making by University of Defence during activation of forces and means of the integrated rescue system in case of emergencies - accidents at a nuclear power plant in Czech Republic. Activities of various stages of deployment earmarked forces and resources are managed and evaluated by using MCMS application (Military Continuity Management System).

Keywords: Business continuity management system, emergency management, military, nuclear safety.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2074
796 A New Approach for Flexible Document Categorization

Authors: Jebari Chaker, Ounelli Habib

Abstract:

In this paper we propose a new approach for flexible document categorization according to the document type or genre instead of topic. Our approach implements two homogenous classifiers: contextual classifier and logical classifier. The contextual classifier is based on the document URL, whereas, the logical classifier use the logical structure of the document to perform the categorization. The final categorization is obtained by combining contextual and logical categorizations. In our approach, each document is assigned to all predefined categories with different membership degrees. Our experiments demonstrate that our approach is best than other genre categorization approaches.

Keywords: Categorization, combination, flexible, logicalstructure, genre, category, URL.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1434
795 The Usefulness of Logical Structure in Flexible Document Categorization

Authors: Jebari Chaker, Ounalli Habib

Abstract:

This paper presents a new approach for automatic document categorization. Exploiting the logical structure of the document, our approach assigns a HTML document to one or more categories (thesis, paper, call for papers, email, ...). Using a set of training documents, our approach generates a set of rules used to categorize new documents. The approach flexibility is carried out with rule weight association representing your importance in the discrimination between possible categories. This weight is dynamically modified at each new document categorization. The experimentation of the proposed approach provides satisfactory results.

Keywords: categorization rule, document categorization, flexible categorization, logical structure.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1205
794 Incremental Learning of Independent Topic Analysis

Authors: Takahiro Nishigaki, Katsumi Nitta, Takashi Onoda

Abstract:

In this paper, we present a method of applying Independent Topic Analysis (ITA) to increasing the number of document data. The number of document data has been increasing since the spread of the Internet. ITA was presented as one method to analyze the document data. ITA is a method for extracting the independent topics from the document data by using the Independent Component Analysis (ICA). ICA is a technique in the signal processing; however, it is difficult to apply the ITA to increasing number of document data. Because ITA must use the all document data so temporal and spatial cost is very high. Therefore, we present Incremental ITA which extracts the independent topics from increasing number of document data. Incremental ITA is a method of updating the independent topics when the document data is added after extracted the independent topics from a just previous the data. In addition, Incremental ITA updates the independent topics when the document data is added. And we show the result applied Incremental ITA to benchmark datasets.

Keywords: Text mining, topic extraction, independent, incremental, independent component analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 999
793 Degraded Document Analysis and Extraction of Original Text Document: An Approach without Optical Character Recognition

Authors: L. Hamsaveni, Navya Prakash, Suresha

Abstract:

Document Image Analysis recognizes text and graphics in documents acquired as images. An approach without Optical Character Recognition (OCR) for degraded document image analysis has been adopted in this paper. The technique involves document imaging methods such as Image Fusing and Speeded Up Robust Features (SURF) Detection to identify and extract the degraded regions from a set of document images to obtain an original document with complete information. In case, degraded document image captured is skewed, it has to be straightened (deskew) to perform further process. A special format of image storing known as YCbCr is used as a tool to convert the Grayscale image to RGB image format. The presented algorithm is tested on various types of degraded documents such as printed documents, handwritten documents, old script documents and handwritten image sketches in documents. The purpose of this research is to obtain an original document for a given set of degraded documents of the same source.

Keywords: Grayscale image format, image fusing, SURF detection, YCbCr image format.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1108
792 The Preservation of Cultural Heritage: Continuity and Memory

Authors: Andrey R. Khazbulatov, Moldir Nurpeiis

Abstract:

Contemporary science and technologies largely widen the gap between the spiritual and rational of the society. Industrial and technological breakthroughs might radically affect most processes in the society, thus losing the cultural heritage. The thinkers recognized the dangers of the decadence in the first place. In the present article the ways of preserving cultural heritage have been investigated. Memory has always been a necessary condition for selfidentification, - continuity is based on this. The authors have supported the hypothesis that continuity and ethnic memory are the very mechanisms that preserve cultural heritage. Such problemformulating will facilitate another, new look at the material, spiritual and arts spheres of the cultural heritage of numerous ethnic groups. The fundamental works by major European and Kazakh scientists have been taken as a basis for the research done.

Keywords: Continuity, cultural heritage, ethnic memory

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2121
791 Scenarios of Societal Security and Business Continuity Cycles

Authors: Jiří F. Urbánek, Jiří Barta

Abstract:

Societal security, continuity scenarios and methodological cycling approach explained in this article. Namely societal security organizational challenges ask implementation of international standards BS 25999-2 & global ISO 22300 which is a family of standards for business continuity management system. Efficient global organization system is distinguished of high entity´s complexity, connectivity & interoperability, having not only cooperative relations in a fact. Competing business have numerous participating ´enemies´, which are in apparent or hidden opponent and antagonistic roles with prosperous organization system, resulting to a crisis scene or even to a battle theatre. Organization business continuity scenarios are necessary for such ´a play´ preparedness, planning, management & overmastering in real environments.

Keywords: Business Continuity, Societal Security Crisis Scenarios Cycles.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2111
790 Combining Color and Layout Features for the Identification of Low-resolution Documents

Authors: Ardhendu Behera, Denis Lalanne, Rolf Ingold

Abstract:

This paper proposes a method, combining color and layout features, for identifying documents captured from lowresolution handheld devices. On one hand, the document image color density surface is estimated and represented with an equivalent ellipse and on the other hand, the document shallow layout structure is computed and hierarchically represented. The combined color and layout features are arranged in a symbolic file, which is unique for each document and is called the document-s visual signature. Our identification method first uses the color information in the signatures in order to focus the search space on documents having a similar color distribution, and finally selects the document having the most similar layout structure in the remaining search space. Finally, our experiment considers slide documents, which are often captured using handheld devices.

Keywords: Document color modeling, document visual signature, kernel density estimation, document identification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1329
789 Color and Layout-based Identification of Documents Captured from Handheld Devices

Authors: Ardhendu Behera, Denis Lalanne, Rolf Ingold

Abstract:

This paper proposes a method, combining color and layout features, for identifying documents captured from low-resolution handheld devices. On one hand, the document image color density surface is estimated and represented with an equivalent ellipse and on the other hand, the document shallow layout structure is computed and hierarchically represented. Our identification method first uses the color information in the documents in order to focus the search space on documents having a similar color distribution, and finally selects the document having the most similar layout structure in the remaining of the search space.

Keywords: Document color modeling, document visualsignature, kernel density estimation, document identification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1517
788 Usage of Military Continuity Management System for Flooding Solution

Authors: J. Palecek, R. Hajkova, A. Oulehlova, H. Malachova

Abstract:

Increase of emergency incidents and crisis situations requires proactive crisis management of authorities and for its solution. Application Business Continuity Management helps the crisis management authorities to quickly and responsibly respond to threats. It also helps effectively and efficiently planning powers and resources. The main goal of this article is describing Military Continuity Management System (MCMS) based on the principles of Business Continuity Management System (BCMS) for dealing with floods in the territory of the selected municipalities. There are explained steps of loading, running and evaluating activities in the software application MCMS. Software MCMS provides complete control over the tasks, contribute a comprehensive and responsible approach solutions to solution floods in the municipality.

Keywords: Business Continuity Management, Floods Plan, Flood Activity, Level of Flood Activity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1387
787 Highlighting Document's Structure

Authors: Sylvie Ratté, Wilfried Njomgue, Pierre-André Ménard

Abstract:

In this paper, we present symbolic recognition models to extract knowledge characterized by document structures. Focussing on the extraction and the meticulous exploitation of the semantic structure of documents, we obtain a meaningful contextual tagging corresponding to different unit types (title, chapter, section, enumeration, etc.).

Keywords: Information retrieval, document structures, symbolic grammars.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1188
786 Collaborative Document Evaluation: An Alternative Approach to Classic Peer Review

Authors: J. Beel, B. Gipp

Abstract:

Research papers are usually evaluated via peer review. However, peer review has limitations in evaluating research papers. In this paper, Scienstein and the new idea of 'collaborative document evaluation' are presented. Scienstein is a project to evaluate scientific papers collaboratively based on ratings, links, annotations and classifications by the scientific community using the internet. In this paper, critical success factors of collaborative document evaluation are analyzed. That is the scientists- motivation to participate as reviewers, the reviewers- competence and the reviewers- trustworthiness. It is shown that if these factors are ensured, collaborative document evaluation may prove to be a more objective, faster and less resource intensive approach to scientific document evaluation in comparison to the classical peer review process. It is shown that additional advantages exist as collaborative document evaluation supports interdisciplinary work, allows continuous post-publishing quality assessments and enables the implementation of academic recommendation engines. In the long term, it seems possible that collaborative document evaluation will successively substitute peer review and decrease the need for journals.

Keywords: Peer Review, Alternative, Collaboration, Document Evaluation, Rating, Annotations.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1441
785 Data Extraction of XML Files using Searching and Indexing Techniques

Authors: Sushma Satpute, Vaishali Katkar, Nilesh Sahare

Abstract:

XML files contain data which is in well formatted manner. By studying the format or semantics of the grammar it will be helpful for fast retrieval of the data. There are many algorithms which describes about searching the data from XML files. There are no. of approaches which uses data structure or are related to the contents of the document. In these cases user must know about the structure of the document and information retrieval techniques using NLPs is related to content of the document. Hence the result may be irrelevant or not so successful and may take more time to search.. This paper presents fast XML retrieval techniques by using new indexing technique and the concept of RXML. When indexing an XML document, the system takes into account both the document content and the document structure and assigns the value to each tag from file. To query the system, a user is not constrained about fixed format of query.

Keywords: XML Retrieval, Indexed Search, Information Retrieval.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1736
784 Skew Detection Technique for Binary Document Images based on Hough Transform

Authors: Manjunath Aradhya V N, Hemantha Kumar G, Shivakumara P

Abstract:

Document image processing has become an increasingly important technology in the automation of office documentation tasks. During document scanning, skew is inevitably introduced into the incoming document image. Since the algorithm for layout analysis and character recognition are generally very sensitive to the page skew. Hence, skew detection and correction in document images are the critical steps before layout analysis. In this paper, a novel skew detection method is presented for binary document images. The method considered the some selected characters of the text which may be subjected to thinning and Hough transform to estimate skew angle accurately. Several experiments have been conducted on various types of documents such as documents containing English Documents, Journals, Text-Book, Different Languages and Document with different fonts, Documents with different resolutions, to reveal the robustness of the proposed method. The experimental results revealed that the proposed method is accurate compared to the results of well-known existing methods.

Keywords: Optical Character Recognition, Skew angle, Thinning, Hough transform, Document processing

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2053
783 Corporate Knowledge Communication and Knowledge Communication Difficulties

Authors: H. Buluthan Cetintas, M. Nejat Ozupek

Abstract:

Communication is an important factor and a prop in directing corporate activities efficiently, in ensuring the flow of knowledge which is necessary for the continuity of the institution, in creating a common language in the institution, in transferring corporate culture and ultimately in corporate success. The idea of transmitting the knowledge among the workers in a healthy manner has revived knowledge communication. Knowledge communication can be defined as the act of mutual creation and communication of intuitions, assessments, experiences and capabilities, as long as maintained effectively, can provide advantages such as corporate continuity, access to corporate objectives and making true administrative decisions. Although the benefits of the knowledge communication to corporations are known, and the necessary worth and care is given, some hardships may arise which makes it difficult or even block it. In this article, difficulties that prevent knowledge communication will be discussed and solutions will be proposed.

Keywords: Corporate knowledge communication, knowledge communication, knowledge communication barriers

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1373
782 Persian/Arabic Document Segmentation Based On Pyramidal Image Structure

Authors: Seyyed Yasser Hashemi, Khalil Monfaredi

Abstract:

Automatic transformation of paper documents into electronic documents requires document segmentation at the first stage. However, some parameters restrictions such as variations in character font sizes, different text line spacing, and also not uniform document layout structures altogether have made it difficult to design a general-purpose document layout analysis algorithm for many years. Thus in most previously reported methods it is inevitable to include these parameters. This problem becomes excessively acute and severe, especially in Persian/Arabic documents. Since the Persian/Arabic scripts differ considerably from the English scripts, most of the proposed methods for the English scripts do not render good results for the Persian scripts. In this paper, we present a novel parameter-free method for segmenting the Persian/Arabic document images which also works well for English scripts. This method segments the document image into maximal homogeneous regions and identifies them as texts and non-texts based on a pyramidal image structure. In other words the proposed method is capable of document segmentation without considering the character font sizes, text line spacing, and document layout structures. This algorithm is examined for 150 Arabic/Persian and English documents and document segmentation process are done successfully for 96 percent of documents.

Keywords: Persian/Arabic document, document segmentation, Pyramidal Image Structure, skew detection and correction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1726
781 Role of Personnel Planning in Business Continuity Management

Authors: M. Königová, J. Fejfar

Abstract:

Business continuity management (BCM) identifies potential external and internal threats to an organization and their impacts to business operations. The goal of the article is to identify, based on the analysis of employee turnover in organizations in the Czech Republic, the role of personnel planning in BCM. The article is organized as follows. The first part of the article concentrates on the theoretical background of the topic. The second part of the article is dedicated to the evaluation of the outcomes of the survey conducted (questionnaire survey), focusing on the analysis of employee turnover in organizations in the Czech Republic. The final part of the article underlines the role of personnel planning in BCM, since poor planning of staff needs in an organization can represent a future threat for business continuity ensuring.

Keywords: Business continuity management, key employees, personnel planning, turnover.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2567
780 A Study on Finding Similar Document with Multiple Categories

Authors: R. Saraçoğlu, N. Allahverdi

Abstract:

Searching similar documents and document management subjects have important place in text mining. One of the most important parts of similar document research studies is the process of classifying or clustering the documents. In this study, a similar document search approach that includes discussion of out the case of belonging to multiple categories (multiple categories problem) has been carried. The proposed method that based on Fuzzy Similarity Classification (FSC) has been compared with Rocchio algorithm and naive Bayes method which are widely used in text mining. Empirical results show that the proposed method is quite successful and can be applied effectively. For the second stage, multiple categories vector method based on information of categories regarding to frequency of being seen together has been used. Empirical results show that achievement is increased almost two times, when proposed method is compared with classical approach.

Keywords: Document similarity, Fuzzy classification, Multiple categories, Text mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1659
779 An Introduction to the Concept of University – Community Business Continuity Management for Disaster Resilient City

Authors: Siyanee Hirunsalee, Paola Rizziand Hidehiko Kanegae

Abstract:

The fundamental objective of the university is to genuinely provide a higher education to mankind and society. Higher education institutions earn billions of dollars in research funds, granted by national government or related institutions, which literally came from taxpayers. Everyday universities consume those grants; in return, provide society with a human resource and research developments. However, not all taxpayers have their major concerns on those researches, other than that they are more curiously to see the project being build tangibly and evidently to certify what they pay for. This paper introduces the concept of University – Community Business Continuity Management for Disaster – Resilient City, which modified the concept of Business Continuity Management (BCM) toward university community to create advancing collaboration leading to the disaster – resilient community and city. This paper focuses on describing in details the backgrounds and principles of the concept and discussing the advantages and limitations of the concept.

Keywords: Business Continuity Management, Disaster ResilientCity, Disaster Management, University – Community Collaboration.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1721
778 A Base Plan for Tomorrow’s Patient Care Information Systems

Authors: M. Tsirintani

Abstract:

The article is proposing a base plan for the future Patient Care Information Systems in a changing health care environment where it is necessary to assure quality patient care services and reducing cost and where new technology trends give the opportunities to develop clinical applications and services patient focused according to new business objectives.

Keywords: Health care management, planning patient care information system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1747
777 A Proposed Hybrid Approach for Feature Selection in Text Document Categorization

Authors: M. F. Zaiyadi, B. Baharudin

Abstract:

Text document categorization involves large amount of data or features. The high dimensionality of features is a troublesome and can affect the performance of the classification. Therefore, feature selection is strongly considered as one of the crucial part in text document categorization. Selecting the best features to represent documents can reduce the dimensionality of feature space hence increase the performance. There were many approaches has been implemented by various researchers to overcome this problem. This paper proposed a novel hybrid approach for feature selection in text document categorization based on Ant Colony Optimization (ACO) and Information Gain (IG). We also presented state-of-the-art algorithms by several other researchers.

Keywords: Ant colony optimization, feature selection, information gain, text categorization, text representation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2016
776 Information Filtering using Index Word Selection based on the Topics

Authors: Takeru YOKOI, Hidekazu YANAGIMOTO, Sigeru OMATU

Abstract:

We have proposed an information filtering system using index word selection from a document set based on the topics included in a set of documents. This method narrows down the particularly characteristic words in a document set and the topics are obtained by Sparse Non-negative Matrix Factorization. In information filtering, a document is often represented with the vector in which the elements correspond to the weight of the index words, and the dimension of the vector becomes larger as the number of documents is increased. Therefore, it is possible that useless words as index words for the information filtering are included. In order to address the problem, the dimension needs to be reduced. Our proposal reduces the dimension by selecting index words based on the topics included in a document set. We have applied the Sparse Non-negative Matrix Factorization to the document set to obtain these topics. The filtering is carried out based on a centroid of the learning document set. The centroid is regarded as the user-s interest. In addition, the centroid is represented with a document vector whose elements consist of the weight of the selected index words. Using the English test collection MEDLINE, thus, we confirm the effectiveness of our proposal. Hence, our proposed selection can confirm the improvement of the recommendation accuracy from the other previous methods when selecting the appropriate number of index words. In addition, we discussed the selected index words by our proposal and we found our proposal was able to select the index words covered some minor topics included in the document set.

Keywords: Information Filtering, Sparse NMF, Index wordSelection, User Profile, Chi-squared Measure

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1407
775 A Proposed Approach for Emotion Lexicon Enrichment

Authors: Amr Mansour Mohsen, Hesham Ahmed Hassan, Amira M. Idrees

Abstract:

Document Analysis is an important research field that aims to gather the information by analyzing the data in documents. As one of the important targets for many fields is to understand what people actually want, sentimental analysis field has been one of the vital fields that are tightly related to the document analysis. This research focuses on analyzing text documents to classify each document according to its opinion. The aim of this research is to detect the emotions from text documents based on enriching the lexicon with adapting their content based on semantic patterns extraction. The proposed approach has been presented, and different experiments are applied by different perspectives to reveal the positive impact of the proposed approach on the classification results.

Keywords: Document analysis, sentimental analysis, emotion detection, WEKA tool, NRC Lexicon.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1404
774 Using Suffix Tree Document Representation in Hierarchical Agglomerative Clustering

Authors: Daniel I. Morariu, Radu G. Cretulescu, Lucian N. Vintan

Abstract:

In text categorization problem the most used method for documents representation is based on words frequency vectors called VSM (Vector Space Model). This representation is based only on words from documents and in this case loses any “word context" information found in the document. In this article we make a comparison between the classical method of document representation and a method called Suffix Tree Document Model (STDM) that is based on representing documents in the Suffix Tree format. For the STDM model we proposed a new approach for documents representation and a new formula for computing the similarity between two documents. Thus we propose to build the suffix tree only for any two documents at a time. This approach is faster, it has lower memory consumption and use entire document representation without using methods for disposing nodes. Also for this method is proposed a formula for computing the similarity between documents, which improves substantially the clustering quality. This representation method was validated using HAC - Hierarchical Agglomerative Clustering. In this context we experiment also the stemming influence in the document preprocessing step and highlight the difference between similarity or dissimilarity measures to find “closer" documents.

Keywords: Text Clustering, Suffix tree documentrepresentation, Hierarchical Agglomerative Clustering

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1865