Search results for: document warehouse
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 313

Search results for: document warehouse

253 Graph-Based Text Similarity Measurement by Exploiting Wikipedia as Background Knowledge

Authors: Lu Zhang, Chunping Li, Jun Liu, Hui Wang

Abstract:

Text similarity measurement is a fundamental issue in many textual applications such as document clustering, classification, summarization and question answering. However, prevailing approaches based on Vector Space Model (VSM) more or less suffer from the limitation of Bag of Words (BOW), which ignores the semantic relationship among words. Enriching document representation with background knowledge from Wikipedia is proven to be an effective way to solve this problem, but most existing methods still cannot avoid similar flaws of BOW in a new vector space. In this paper, we propose a novel text similarity measurement which goes beyond VSM and can find semantic affinity between documents. Specifically, it is a unified graph model that exploits Wikipedia as background knowledge and synthesizes both document representation and similarity computation. The experimental results on two different datasets show that our approach significantly improves VSM-based methods in both text clustering and classification.

Keywords: Text classification, Text clustering, Text similarity, Wikipedia

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2062
252 Experiments on Element and Document Statistics for XML Retrieval

Authors: Mohamed Ben Aouicha, Mohamed Tmar, Mohand Boughanem, Mohamed Abid

Abstract:

This paper presents an information retrieval model on XML documents based on tree matching. Queries and documents are represented by extended trees. An extended tree is built starting from the original tree, with additional weighted virtual links between each node and its indirect descendants allowing to directly reach each descendant. Therefore only one level separates between each node and its indirect descendants. This allows to compare the user query and the document with flexibility and with respect to the structural constraints of the query. The content of each node is very important to decide weither a document element is relevant or not, thus the content should be taken into account in the retrieval process. We separate between the structure-based and the content-based retrieval processes. The content-based score of each node is commonly based on the well-known Tf × Idf criteria. In this paper, we compare between this criteria and another one we call Tf × Ief. The comparison is based on some experiments into a dataset provided by INEX1 to show the effectiveness of our approach on one hand and those of both weighting functions on the other.

Keywords: XML retrieval, INEX, Tf × Idf, Tf × Ief

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1280
251 The Study of Cost Accounting in S Company Based On TDABC

Authors: Heng Ma

Abstract:

Third-party warehousing logistics has an important role in the development of external logistics. At present, the third-party logistics in our country is still a new industry, the accounting system has not yet been established, the current financial accounting system of third-party warehousing logistics is mainly in the traditional way of thinking, and only able to provide the total cost information of the entire enterprise during the accounting period, unable to reflect operating indirect cost information. In order to solve the problem of third-party logistics industry cost information distortion, improve the level of logistics cost management, the paper combines theoretical research and case analysis method to reflect cost allocation by building third-party logistics costing model using Time-Driven Activity-Based Costing(TDABC), and takes S company as an example to account and control the warehousing logistics cost.Based on the idea of “Products consume activities and activities consume resources”, TDABC put time into the main cost driver and use time-consuming equation resources assigned to cost objects. In S company, the objects focuses on three warehouse, engaged with warehousing and transportation (the second warehouse, transport point) service. These three warehouse respectively including five departments, Business Unit, Production Unit, Settlement Center, Security Department and Equipment Division, the activities in these departments are classified by in-out of storage forecast, in-out of storage or transit and safekeeping work. By computing capacity cost rate, building the time-consuming equation, the paper calculates the final operation cost so as to reveal the real cost.The numerical analysis results show that the TDABC can accurately reflect the cost allocation of service customers and reveal the spare capacity cost of resource center, verifies the feasibility and validity of TDABC in third-party logistics industry cost accounting. It inspires enterprises focus on customer relationship management and reduces idle cost to strengthen the cost management of third-party logistics enterprises.

Keywords: Third-party logistics enterprises, TDABC, cost management, S company.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2378
250 Estimation of Skew Angle in Binary Document Images Using Hough Transform

Authors: Nandini N., Srikanta Murthy K., G. Hemantha Kumar

Abstract:

This paper includes two novel techniques for skew estimation of binary document images. These algorithms are based on connected component analysis and Hough transform. Both these methods focus on reducing the amount of input data provided to Hough transform. In the first method, referred as word centroid approach, the centroids of selected words are used for skew detection. In the second method, referred as dilate & thin approach, the selected characters are blocked and dilated to get word blocks and later thinning is applied. The final image fed to Hough transform has the thinned coordinates of word blocks in the image. The methods have been successful in reducing the computational complexity of Hough transform based skew estimation algorithms. Promising experimental results are also provided to prove the effectiveness of the proposed methods.

Keywords: Dilation, Document processing, Hough transform, Optical Character Recognition, Skew estimation, and Thinning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3224
249 Schema and Data Migration of a Relational Database RDB to the Extensible Markup Language XML

Authors: Alae El Alami, Mohamed Bahaj

Abstract:

This article discusses the passage of RDB to XML documents (schema and data) based on metadata and semantic enrichment, which makes the RDB under flattened shape and is enriched by the object concept. The integration and exploitation of the object concept in the XML uses a syntax allowing for the verification of the conformity of the document XML during the creation. The information extracted from the RDB is therefore analyzed and filtered in order to adjust according to the structure of the XML files and the associated object model. Those implemented in the XML document through a SQL query are built dynamically. A prototype was implemented to realize automatic migration, and so proves the effectiveness of this particular approach.

Keywords: RDB, XML, DTD, semantic enrichment.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1790
248 Bottom Up Text Mining through Hierarchical Document Representation

Authors: Y. Djouadi., F. Souam.

Abstract:

Most of the existing text mining approaches are proposed, keeping in mind, transaction databases model. Thus, the mined dataset is structured using just one concept: the “transaction", whereas the whole dataset is modeled using the “set" abstract type. In such cases, the structure of the whole dataset and the relationships among the transactions themselves are not modeled and consequently, not considered in the mining process. We believe that taking into account structure properties of hierarchically structured information (e.g. textual document, etc ...) in the mining process, can leads to best results. For this purpose, an hierarchical associations rule mining approach for textual documents is proposed in this paper and the classical set-oriented mining approach is reconsidered profits to a Direct Acyclic Graph (DAG) oriented approach. Natural languages processing techniques are used in order to obtain the DAG structure. Based on this graph model, an hierarchical bottom up algorithm is proposed. The main idea is that each node is mined with its parent node.

Keywords: Graph based association rules mining, Hierarchical document structure, Text mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2011
247 Five Vital Factors Related to Employees’ Job Performance

Authors: Siri-orn Champatong

Abstract:

The purpose of this research was to study five vital factors related to employees’ job performance. A total of 250 respondents were sampled from employees who worked at a public warehouse organization, Bangkok, Thailand. Samples were divided into two groups according to their work experience. The average working experience was about 9 years for group one and 28 years for group two. A questionnaire was utilized as a tool to collect data. Statistics utilized in this research included frequency, percentage, mean, standard deviation, t-test analysis, one way ANOVA, and Pearson Product-moment correlation coefficient. Data were analyzed by using Statistical Package for the Social Sciences. The findings disclosed that the majority of respondents were female between 23- 31 years old, single, and hold an undergraduate degree. The average income of respondents was less than 30,900 baht. The findings also revealed that the factors of organization chart awareness, job process and technology, internal environment, employee loyalty, and policy and management were ranked as medium level. The hypotheses testing revealed that difference in gender, age, and position had differences in terms of the awareness of organization chart, job process and technology, internal environment, employee loyalty, and policy and management in the same direction with low level.

Keywords: Employees, Factors Related, Job Performance, Public Warehouse Organization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1602
246 Implementation of RSA Blind Signature on CryptO-0N2 Protocol

Authors: Esti Rahmawati Agustina, Is Esti Firmanesa

Abstract:

Blind Signature were introduced by Chaum. In this scheme, a signer can “sign” a document without knowing the document contain. This is particularly important in electronic voting. CryptO-0N2 is an electronic voting protocol which is development of CryptO-0N. During its development this protocol has not been furnished with the requirement of blind signature, so the choice of voters can be determined by counting center. In this paper will be presented of implementation of blind signature using RSA algorithm.

Keywords: Blind signature, electronic voting protocol, RSA algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3149
245 The Relevance of Data Warehousing and Data Mining in the Field of Evidence-based Medicine to Support Healthcare Decision Making

Authors: Nevena Stolba, A Min Tjoa

Abstract:

Evidence-based medicine is a new direction in modern healthcare. Its task is to prevent, diagnose and medicate diseases using medical evidence. Medical data about a large patient population is analyzed to perform healthcare management and medical research. In order to obtain the best evidence for a given disease, external clinical expertise as well as internal clinical experience must be available to the healthcare practitioners at right time and in the right manner. External evidence-based knowledge can not be applied directly to the patient without adjusting it to the patient-s health condition. We propose a data warehouse based approach as a suitable solution for the integration of external evidence-based data sources into the existing clinical information system and data mining techniques for finding appropriate therapy for a given patient and a given disease. Through integration of data warehousing, OLAP and data mining techniques in the healthcare area, an easy to use decision support platform, which supports decision making process of care givers and clinical managers, is built. We present three case studies, which show, that a clinical data warehouse that facilitates evidence-based medicine is a reliable, powerful and user-friendly platform for strategic decision making, which has a great relevance for the practice and acceptance of evidence-based medicine.

Keywords: data mining, data warehousing, decision-support systems, evidence-based medicine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3739
244 Ontology-based Concept Weighting for Text Documents

Authors: Hmway Hmway Tar, Thi Thi Soe Nyaunt

Abstract:

Documents clustering become an essential technology with the popularity of the Internet. That also means that fast and high-quality document clustering technique play core topics. Text clustering or shortly clustering is about discovering semantically related groups in an unstructured collection of documents. Clustering has been very popular for a long time because it provides unique ways of digesting and generalizing large amounts of information. One of the issues of clustering is to extract proper feature (concept) of a problem domain. The existing clustering technology mainly focuses on term weight calculation. To achieve more accurate document clustering, more informative features including concept weight are important. Feature Selection is important for clustering process because some of the irrelevant or redundant feature may misguide the clustering results. To counteract this issue, the proposed system presents the concept weight for text clustering system developed based on a k-means algorithm in accordance with the principles of ontology so that the important of words of a cluster can be identified by the weight values. To a certain extent, it has resolved the semantic problem in specific areas.

Keywords: Clustering, Concept Weight, Document clustering, Feature Selection, Ontology

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2362
243 Automatic Building an Extensive Arabic FA Terms Dictionary

Authors: El-Sayed Atlam, Masao Fuketa, Kazuhiro Morita, Jun-ichi Aoe

Abstract:

Field Association (FA) terms are a limited set of discriminating terms that give us the knowledge to identify document fields which are effective in document classification, similar file retrieval and passage retrieval. But the problem lies in the lack of an effective method to extract automatically relevant Arabic FA Terms to build a comprehensive dictionary. Moreover, all previous studies are based on FA terms in English and Japanese, and the extension of FA terms to other language such Arabic could be definitely strengthen further researches. This paper presents a new method to extract, Arabic FA Terms from domain-specific corpora using part-of-speech (POS) pattern rules and corpora comparison. Experimental evaluation is carried out for 14 different fields using 251 MB of domain-specific corpora obtained from Arabic Wikipedia dumps and Alhyah news selected average of 2,825 FA Terms (single and compound) per field. From the experimental results, recall and precision are 84% and 79% respectively. Therefore, this method selects higher number of relevant Arabic FA Terms at high precision and recall.

Keywords: Arabic Field Association Terms, information extraction, document classification, information retrieval.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1696
242 Graphic Watermarking, Security Feature in Cadastral Content Management

Authors: Manole Velicanu, Emanuil Rednic

Abstract:

The paper shows the necessity to increase the security level for paper management in the cadastral field by using specific graphical watermarks. Using the graphical watermarking will increase the security in the cadastral content management; furthermore any altered document will be validated afterwards of its originality by checking the graphic watermark. If, by any reasons the document is changed for counterfeiting, it is invalidated and found that is an illegal copy due to the graphic check of the watermarking, check made at pixel level

Keywords: cadastral system, database security, security standards, content management, identity management, watermarking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1488
241 Multi-agent Data Fusion Architecture for Intelligent Web Information Retrieval

Authors: Amin Milani Fard, Mohsen Kahani, Reza Ghaemi, Hamid Tabatabaee

Abstract:

In this paper we propose a multi-agent architecture for web information retrieval using fuzzy logic based result fusion mechanism. The model is designed in JADE framework and takes advantage of JXTA agent communication method to allow agent communication through firewalls and network address translators. This approach enables developers to build and deploy P2P applications through a unified medium to manage agent-based document retrieval from multiple sources.

Keywords: Information retrieval systems, list fusion methods, document score, multi-agent systems.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1545
240 Mining News Sites to Create Special Domain News Collections

Authors: David B. Bracewell, Fuji Ren, Shingo Kuroiwa

Abstract:

We present a method to create special domain collections from news sites. The method only requires a single sample article as a seed. No prior corpus statistics are needed and the method is applicable to multiple languages. We examine various similarity measures and the creation of document collections for English and Japanese. The main contributions are as follows. First, the algorithm can build special domain collections from as little as one sample document. Second, unlike other algorithms it does not require a second “general" corpus to compute statistics. Third, in our testing the algorithm outperformed others in creating collections made up of highly relevant articles.

Keywords: Information Retrieval, News, Special DomainCollections,

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1438
239 Association Rules Mining and NOSQL Oriented Document in Big Data

Authors: Sarra Senhadji, Imene Benzeguimi, Zohra Yagoub

Abstract:

Big Data represents the recent technology of manipulating voluminous and unstructured data sets over multiple sources. Therefore, NOSQL appears to handle the problem of unstructured data. Association rules mining is one of the popular techniques of data mining to extract hidden relationship from transactional databases. The algorithm for finding association dependencies is well-solved with Map Reduce. The goal of our work is to reduce the time of generating of frequent itemsets by using Map Reduce and NOSQL database oriented document. A comparative study is given to evaluate the performances of our algorithm with the classical algorithm Apriori.

Keywords: Apriori, Association rules mining, Big Data, data mining, Hadoop, Map Reduce, MongoDB, NoSQL.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 622
238 An Empirical Analysis of Earnings Management in Australia

Authors: Lan Sun, Subhrendu Rath

Abstract:

This is a comprehensive large-sample study of Australian earnings management. Using a sample of 4,844 firm-year observations across nine Australia industries from 2000 to 2006, we find substantial corporate earnings management activity across several Australian industries. We document strong evidence of size and return on assets being primary determinants of earnings management in Australia. The effects of size and return on assets are also found to be dominant in both income-increasing and incomedecreasing earnings manipulation. We also document that that periphery sector firms are more likely to involve larger magnitude of earnings management than firms in the core sector.

Keywords: Earnings management, discretionary accruals, income-increasing/decreasing manipulation, dual economy sector

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3687
237 Prediction of Writer Using Tamil Handwritten Document Image Based on Pooled Features

Authors: T. Thendral, M. S. Vijaya, S. Karpagavalli

Abstract:

Tamil handwritten document is taken as a key source of data to identify the writer. Tamil is a classical language which has 247 characters include compound characters, consonants, vowels and special character. Most characters of Tamil are multifaceted in nature. Handwriting is a unique feature of an individual. Writer may change their handwritings according to their frame of mind and this place a risky challenge in identifying the writer. A new discriminative model with pooled features of handwriting is proposed and implemented using support vector machine. It has been reported on 100% of prediction accuracy by RBF and polynomial kernel based classification model.

Keywords: Classification, Feature extraction, Support vector machine, Training, Writer.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2277
236 Prediction of Writer Using Tamil Handwritten Document Image Based on Pooled Features

Authors: T. Thendral, M. S. Vijaya, S. Karpagavalli

Abstract:

Tamil handwritten document is taken as a key source of data to identify the writer. Tamil is a classical language which has 247 characters include compound characters, consonants, vowels and special character. Most characters of Tamil are multifaceted in nature. Handwriting is a unique feature of an individual. Writer may change their handwritings according to their frame of mind and this place a risky challenge in identifying the writer. A new discriminative model with pooled features of handwriting is proposed and implemented using support vector machine. It has been reported on 100% of prediction accuracy by RBF and polynomial kernel based classification model.

Keywords: Classification, Feature extraction, Support vector machine, Training, Writer.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1657
235 Extraction of Significant Phrases from Text

Authors: Yuan J. Lui

Abstract:

Prospective readers can quickly determine whether a document is relevant to their information need if the significant phrases (or keyphrases) in this document are provided. Although keyphrases are useful, not many documents have keyphrases assigned to them, and manually assigning keyphrases to existing documents is costly. Therefore, there is a need for automatic keyphrase extraction. This paper introduces a new domain independent keyphrase extraction algorithm. The algorithm approaches the problem of keyphrase extraction as a classification task, and uses a combination of statistical and computational linguistics techniques, a new set of attributes, and a new machine learning method to distinguish keyphrases from non-keyphrases. The experiments indicate that this algorithm performs better than other keyphrase extraction tools and that it significantly outperforms Microsoft Word 2000-s AutoSummarize feature. The domain independence of this algorithm has also been confirmed in our experiments.

Keywords: classification, keyphrase extraction, machine learning, summarization

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2008
234 Compression of Semistructured Documents

Authors: Leo Galambos, Jan Lansky, Katsiaryna Chernik

Abstract:

EGOTHOR is a search engine that indexes the Web and allows us to search the Web documents. Its hit list contains URL and title of the hits, and also some snippet which tries to shortly show a match. The snippet can be almost always assembled by an algorithm that has a full knowledge of the original document (mostly HTML page). It implies that the search engine is required to store the full text of the documents as a part of the index. Such a requirement leads us to pick up an appropriate compression algorithm which would reduce the space demand. One of the solutions could be to use common compression methods, for instance gzip or bzip2, but it might be preferable if we develop a new method which would take advantage of the document structure, or rather, the textual character of the documents. There already exist a special compression text algorithms and methods for a compression of XML documents. The aim of this paper is an integration of the two approaches to achieve an optimal level of the compression ratio

Keywords: Compression, search engine, HTML, XML.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1532
233 Simultaneous Segmentation and Recognition of Arabic Characters in an Unconstrained On-Line Cursive Handwritten Document

Authors: Randa I. Elanwar, Mohsen A. Rashwan, Samia A. Mashali

Abstract:

The last two decades witnessed some advances in the development of an Arabic character recognition (CR) system. Arabic CR faces technical problems not encountered in any other language that make Arabic CR systems achieve relatively low accuracy and retards establishing them as market products. We propose the basic stages towards a system that attacks the problem of recognizing online Arabic cursive handwriting. Rule-based methods are used to perform simultaneous segmentation and recognition of word portions in an unconstrained cursively handwritten document using dynamic programming. The output of these stages is in the form of a ranked list of the possible decisions. A new technique for text line separation is also used.

Keywords: Arabic handwriting, character recognition, cursive handwriting, on-line recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1851
232 IFDewey: A New Insert-Friendly Labeling Schemafor XML Data

Authors: S. Soltan, A. Zarnani, R. AliMohammadzadeh, M. Rahgozar

Abstract:

XML has become a popular standard for information exchange via web. Each XML document can be presented as a rooted, ordered, labeled tree. The Node label shows the exact position of a node in the original document. Region and Dewey encoding are two famous methods of labeling trees. In this paper, we propose a new insert friendly labeling method named IFDewey based on recently proposed scheme, called Extended Dewey. In Extended Dewey many labels must be modified when a new node is inserted into the XML tree. Our method eliminates this problem by reserving even numbers for future insertion. Numbers generated by Extended Dewey may be even or odd. IFDewey modifies Extended Dewey so that only odd numbers are generated and even numbers can then be used for a much easier insertion of nodes.

Keywords: XML, tree labeling, query processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1591
231 Software Engineering Interoperable Environment for University Process Workflow and Document Management

Authors: Bekim Fetaji, Majlinda Fetaji, Mirlinda Ebibi

Abstract:

The objective of the research was focused on the design, development and evaluation of a sustainable web based network system to be used as an interoperable environment for University process workflow and document management. In this manner the most of the process workflows in Universities can be entirely realized electronically and promote integrated University. Definition of the most used University process workflows enabled creating electronic workflows and their execution on standard workflow execution engines. Definition or reengineering of workflows provided increased work efficiency and helped in having standardized process through different faculties. The concept and the process definition as well as the solution applied as Case study are evaluated and findings are reported.

Keywords: design process workflows, workflow and documentmanagement, Business Process, software engineering

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1277
230 Identification of Most Frequently Occurring Lexis in Body-enhancement Medicinal Unsolicited Bulk e-mails

Authors: Jatinderkumar R. Saini, Apurva A. Desai

Abstract:

e-mail has become an important means of electronic communication but the viability of its usage is marred by Unsolicited Bulk e-mail (UBE) messages. UBE consists of many types like pornographic, virus infected and 'cry-for-help' messages as well as fake and fraudulent offers for jobs, winnings and medicines. UBE poses technical and socio-economic challenges to usage of e-mails. To meet this challenge and combat this menace, we need to understand UBE. Towards this end, the current paper presents a content-based textual analysis of more than 2700 body enhancement medicinal UBE. Technically, this is an application of Text Parsing and Tokenization for an un-structured textual document and we approach it using Bag Of Words (BOW) and Vector Space Document Model techniques. We have attempted to identify the most frequently occurring lexis in the UBE documents that advertise various products for body enhancement. The analysis of such top 100 lexis is also presented. We exhibit the relationship between occurrence of a word from the identified lexis-set in the given UBE and the probability that the given UBE will be the one advertising for fake medicinal product. To the best of our knowledge and survey of related literature, this is the first formal attempt for identification of most frequently occurring lexis in such UBE by its textual analysis. Finally, this is a sincere attempt to bring about alertness against and mitigate the threat of such luring but fake UBE.

Keywords: Body Enhancement, Lexis, Medicinal, Unsolicited Bulk e-mail (UBE), Vector Space Document Model, Viagra

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3455
229 Popularization of the Communist Manifesto in 19th Century Europe

Authors: Xuanyu Bai

Abstract:

“The Communist Manifesto”, written by Karl Marx and Friedrich Engels, is one of the most significant documents throughout the whole history which covers across different fields including Economic, Politic, Sociology and Philosophy. Instead of discussing the Communist ideas presented in the Communist Manifesto, the essay focuses on exploring the reasons that contributed to the popularization of the document and its influence on political revolutions in 19th century Europe by concentrating on the document itself along with other primary and secondary sources and temporal artwork. Combining the details from the Communist Manifesto and other documents, Marx’s writing style and word choice, his convincible notions about a new society dominated by proletariats, and the revolutionary idea of class destruction has led to the popularization of the Communist Manifesto and influenced the latter political revolutions.

Keywords: Communist Manifesto, The Wealth of Nations, 19th century Europe, word choice, capitalism, communism.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 495
228 Support Vector Machine for Persian Font Recognition

Authors: A. Borji, M. Hamidi

Abstract:

In this paper we examine the use of global texture analysis based approaches for the purpose of Persian font recognition in machine-printed document images. Most existing methods for font recognition make use of local typographical features and connected component analysis. However derivation of such features is not an easy task. Gabor filters are appropriate tools for texture analysis and are motivated by human visual system. Here we consider document images as textures and use Gabor filter responses for identifying the fonts. The method is content independent and involves no local feature analysis. Two different classifiers Weighted Euclidean Distance and SVM are used for the purpose of classification. Experiments on seven different type faces and four font styles show average accuracy of 85% with WED and 82% with SVM classifier over typefaces

Keywords: Persian font recognition, support vector machine, gabor filter.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1662
227 Content-based Retrieval of Medical Images

Authors: Lilac A. E. Al-Safadi

Abstract:

With the advance of multimedia and diagnostic images technologies, the number of radiographic images is increasing constantly. The medical field demands sophisticated systems for search and retrieval of the produced multimedia document. This paper presents an ongoing research that focuses on the semantic content of radiographic image documents to facilitate semantic-based radiographic image indexing and a retrieval system. The proposed model would divide a radiographic image document, based on its semantic content, and would be converted into a logical structure or a semantic structure. The logical structure represents the overall organization of information. The semantic structure, which is bound to logical structure, is composed of semantic objects with interrelationships in the various spaces in the radiographic image.

Keywords: Semantic Indexing, Content-Based Retrieval, Radiographic Images, Data Model

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1448
226 Extending the Conceptual Neighborhood Graph of the Relations for the Semantic Adaptation of Multimedia Documents

Authors: Azze-Eddine Maredj, Nourredine Tonkin

Abstract:

The recent developments in computing and communication technology permit to users to access multimedia documents with variety of devices (PCs, PDAs, mobile phones...) having heterogeneous capabilities. This diversification of supports has trained the need to adapt multimedia documents according to their execution contexts. A semantic framework for multimedia document adaptation based on the conceptual neighborhood graphs was proposed. In this framework, adapting consists on finding another specification that satisfies the target constraints and which is as close as possible from the initial document. In this paper, we propose a new way of building the conceptual neighborhood graphs to best preserve the proximity between the adapted and the original documents and to deal with more elaborated relations models by integrating the relations relaxation graphs that permit to handle the delays and the distances defined within the relations.

Keywords: Conceptual Neighborhood Graph, Relaxation Graphs, Relations with Delays, Semantic Adaptation of Multimedia Documents.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1501
225 Coalescing Data Marts

Authors: N. Parimala, P. Pahwa

Abstract:

OLAP uses multidimensional structures, to provide access to data for analysis. Traditionally, OLAP operations are more focused on retrieving data from a single data mart. An exception is the drill across operator. This, however, is restricted to retrieving facts on common dimensions of the multiple data marts. Our concern is to define further operations while retrieving data from multiple data marts. Towards this, we have defined six operations which coalesce data marts. While doing so we consider the common as well as the non-common dimensions of the data marts.

Keywords: Data warehouse, Dimension, OLAP, Star Schema.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1516
224 Control Configuration System as a Key Element in Distributed Control System

Authors: Goodarz Sabetian, Sajjad Moshfe

Abstract:

Control system for hi-tech industries could be realized generally and deeply by a special document. Vast heavy industries such as power plants with a large number of I/O signals are controlled by a distributed control system (DCS). This system comprises of so many parts from field level to high control level, and junior instrument engineers may be confused by this enormous information. The key document which can solve this problem is “control configuration system diagram” for each type of DCS. This is a road map that covers all of activities respect to control system in each industrial plant and inevitable to be studied by whom corresponded. It plays an important role from designing control system start point until the end; deliver the system to operate. This should be inserted in bid documents, contracts, purchasing specification and used in different periods of project EPC (engineering, procurement, and construction). Separate parts of DCS are categorized here in order of importance and a brief description and some practical plan is offered. This article could be useful for all instrument and control engineers who worked is EPC projects.

Keywords: Control, configuration, DCS, power plant, bus.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1174