Search results for: document

201 Control Configuration System as a Key Element in Distributed Control System

Authors: Goodarz Sabetian, Sajjad Moshfe

Abstract:

Control system for hi-tech industries could be realized generally and deeply by a special document. Vast heavy industries such as power plants with a large number of I/O signals are controlled by a distributed control system (DCS). This system comprises of so many parts from field level to high control level, and junior instrument engineers may be confused by this enormous information. The key document which can solve this problem is “control configuration system diagram” for each type of DCS. This is a road map that covers all of activities respect to control system in each industrial plant and inevitable to be studied by whom corresponded. It plays an important role from designing control system start point until the end; deliver the system to operate. This should be inserted in bid documents, contracts, purchasing specification and used in different periods of project EPC (engineering, procurement, and construction). Separate parts of DCS are categorized here in order of importance and a brief description and some practical plan is offered. This article could be useful for all instrument and control engineers who worked is EPC projects.

Keywords: Control, configuration, DCS, power plant, bus.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1172

200 Algorithm for Information Retrieval Optimization

Authors: Kehinde K. Agbele, Kehinde Daniel Aruleba, Eniafe F. Ayetiran

Abstract:

When using Information Retrieval Systems (IRS), users often present search queries made of ad-hoc keywords. It is then up to the IRS to obtain a precise representation of the user’s information need and the context of the information. This paper investigates optimization of IRS to individual information needs in order of relevance. The study addressed development of algorithms that optimize the ranking of documents retrieved from IRS. This study discusses and describes a Document Ranking Optimization (DROPT) algorithm for information retrieval (IR) in an Internet-based or designated databases environment. Conversely, as the volume of information available online and in designated databases is growing continuously, ranking algorithms can play a major role in the context of search results. In this paper, a DROPT technique for documents retrieved from a corpus is developed with respect to document index keywords and the query vectors. This is based on calculating the weight (

Keywords: Internet ranking,

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1431

199 Clustering Unstructured Text Documents Using Fading Function

Authors: Pallav Roxy, Durga Toshniwal

Abstract:

Clustering unstructured text documents is an important issue in data mining community and has a number of applications such as document archive filtering, document organization and topic detection and subject tracing. In the real world, some of the already clustered documents may not be of importance while new documents of more significance may evolve. Most of the work done so far in clustering unstructured text documents overlooks this aspect of clustering. This paper, addresses this issue by using the Fading Function. The unstructured text documents are clustered. And for each cluster a statistics structure called Cluster Profile (CP) is implemented. The cluster profile incorporates the Fading Function. This Fading Function keeps an account of the time-dependent importance of the cluster. The work proposes a novel algorithm Clustering n-ary Merge Algorithm (CnMA) for unstructured text documents, that uses Cluster Profile and Fading Function. Experimental results illustrating the effectiveness of the proposed technique are also included.

Keywords: Clustering, Text Mining, Unstructured TextDocuments, Fading Function.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1942

198 Text Summarization for Oil and Gas News Article

Authors: L. H. Chong, Y. Y. Chen

Abstract:

Information is increasing in volumes; companies are overloaded with information that they may lose track in getting the intended information. It is a time consuming task to scan through each of the lengthy document. A shorter version of the document which contains only the gist information is more favourable for most information seekers. Therefore, in this paper, we implement a text summarization system to produce a summary that contains gist information of oil and gas news articles. The summarization is intended to provide important information for oil and gas companies to monitor their competitor-s behaviour in enhancing them in formulating business strategies. The system integrated statistical approach with three underlying concepts: keyword occurrences, title of the news article and location of the sentence. The generated summaries were compared with human generated summaries from an oil and gas company. Precision and recall ratio are used to evaluate the accuracy of the generated summary. Based on the experimental results, the system is able to produce an effective summary with the average recall value of 83% at the compression rate of 25%.

Keywords: Information retrieval, text summarization, statistical approach.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1559

197 Suitability of Requirements Abstraction Model (RAM) Requirements for High-Level System Testing

Authors: Naeem Muhammad, Yves Vandewoude, Yolande Berbers, Robert Feldt

Abstract:

The Requirements Abstraction Model (RAM) helps in managing abstraction in requirements by organizing them at four levels (product, feature, function and component). The RAM is adaptable and can be tailored to meet the needs of the various organizations. Because software requirements are an important source of information for developing high-level tests, organizations willing to adopt the RAM model need to know the suitability of the RAM requirements for developing high-level tests. To investigate this suitability, test cases from twenty randomly selected requirements were developed, analyzed and graded. Requirements were selected from the requirements document of a Course Management System, a web based software system that supports teachers and students in performing course related tasks. This paper describes the results of the requirements document analysis. The results show that requirements at lower levels in the RAM are suitable for developing executable tests whereas it is hard to develop from requirements at higher levels.

Keywords: Market-driven requirements engineering, requirements abstraction model, requirements abstraction, system testing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1924

196 Semantic Indexing Approach of a Corpora Based On Ontology

Authors: Mohammed Erritali

Abstract:

The growth in the volume of text data such as books and articles in libraries for centuries has imposed to establish effective mechanisms to locate them. Early techniques such as abstraction, indexing and the use of classification categories have marked the birth of a new field of research called "Information Retrieval". Information Retrieval (IR) can be defined as the task of defining models and systems whose purpose is to facilitate access to a set of documents in electronic form (corpus) to allow a user to find the relevant ones for him, that is to say, the contents which matches with the information needs of the user. This paper presents a new semantic indexing approach of a documentary corpus. The indexing process starts first by a term weighting phase to determine the importance of these terms in the documents. Then the use of a thesaurus like Wordnet allows moving to the conceptual level. Each candidate concept is evaluated by determining its level of representation of the document, that is to say, the importance of the concept in relation to other concepts of the document. Finally, the semantic index is constructed by attaching to each concept of the ontology, the documents of the corpus in which these concepts are found.

Keywords: Semantic, indexing, corpora, WordNet, ontology.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1331

195 DocPro: A Framework for Processing Semantic and Layout Information in Business Documents

Authors: Ming-Jen Huang, Chun-Fang Huang, Chiching Wei

Abstract:

With the recent advance of the deep neural network, we observe new applications of NLP (natural language processing) and CV (computer vision) powered by deep neural networks for processing business documents. However, creating a real-world document processing system needs to integrate several NLP and CV tasks, rather than treating them separately. There is a need to have a unified approach for processing documents containing textual and graphical elements with rich formats, diverse layout arrangement, and distinct semantics. In this paper, a framework that fulfills this unified approach is presented. The framework includes a representation model definition for holding the information generated by various tasks and specifications defining the coordination between these tasks. The framework is a blueprint for building a system that can process documents with rich formats, styles, and multiple types of elements. The flexible and lightweight design of the framework can help build a system for diverse business scenarios, such as contract monitoring and reviewing.

Keywords: Document processing, framework, formal definition, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 578

194 Lexical Based Method for Opinion Detection on Tripadvisor Collection

Authors: Faiza Belbachir, Thibault Schienhinski

Abstract:

The massive development of online social networks allows users to post and share their opinions on various topics. With this huge volume of opinion, it is interesting to extract and interpret these information for different domains, e.g., product and service benchmarking, politic, system of recommendation. This is why opinion detection is one of the most important research tasks. It consists on differentiating between opinion data and factual data. The difficulty of this task is to determine an approach which returns opinionated document. Generally, there are two approaches used for opinion detection i.e. Lexical based approaches and Machine Learning based approaches. In Lexical based approaches, a dictionary of sentimental words is used, words are associated with weights. The opinion score of document is derived by the occurrence of words from this dictionary. In Machine learning approaches, usually a classifier is trained using a set of annotated document containing sentiment, and features such as n-grams of words, part-of-speech tags, and logical forms. Majority of these works are based on documents text to determine opinion score but dont take into account if these texts are really correct. Thus, it is interesting to exploit other information to improve opinion detection. In our work, we will develop a new way to consider the opinion score. We introduce the notion of trust score. We determine opinionated documents but also if these opinions are really trustable information in relation with topics. For that we use lexical SentiWordNet to calculate opinion and trust scores, we compute different features about users like (numbers of their comments, numbers of their useful comments, Average useful review). After that, we combine opinion score and trust score to obtain a final score. We applied our method to detect trust opinions in TRIPADVISOR collection. Our experimental results report that the combination between opinion score and trust score improves opinion detection.

Keywords: Tripadvisor, Opinion detection, SentiWordNet, trust score.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 691

193 Powerful Tool to Expand Business Intelligence: Text Mining

Authors: Li Gao, Elizabeth Chang, Song Han

Abstract:

With the extensive inclusion of document, especially text, in the business systems, data mining does not cover the full scope of Business Intelligence. Data mining cannot deliver its impact on extracting useful details from the large collection of unstructured and semi-structured written materials based on natural languages. The most pressing issue is to draw the potential business intelligence from text. In order to gain competitive advantages for the business, it is necessary to develop the new powerful tool, text mining, to expand the scope of business intelligence. In this paper, we will work out the strong points of text mining in extracting business intelligence from huge amount of textual information sources within business systems. We will apply text mining to each stage of Business Intelligence systems to prove that text mining is the powerful tool to expand the scope of BI. After reviewing basic definitions and some related technologies, we will discuss the relationship and the benefits of these to text mining. Some examples and applications of text mining will also be given. The motivation behind is to develop new approach to effective and efficient textual information analysis. Thus we can expand the scope of Business Intelligence using the powerful tool, text mining.

Keywords: Business intelligence, document warehouse, text mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2609

192 Secure Text Steganography for Microsoft Word Document

Authors: Khan Farhan Rafat, M. Junaid Hussain

Abstract:

Seamless modification of an entity for the purpose of hiding a message of significance inside its substance in a manner that the embedding remains oblivious to an observer is known as steganography. Together with today's pervasive registering frameworks, steganography has developed into a science that offers an assortment of strategies for stealth correspondence over the globe that must, however, need a critical appraisal from security breach standpoint. Microsoft Word is amongst the preferably used word processing software, which comes as a part of the Microsoft Office suite. With a user-friendly graphical interface, the richness of text editing, and formatting topographies, the documents produced through this software are also most suitable for stealth communication. This research aimed not only to epitomize the fundamental concepts of steganography but also to expound on the utilization of Microsoft Word document as a carrier for furtive message exchange. The exertion is to examine contemporary message hiding schemes from security aspect so as to present the explorative discoveries and suggest enhancements which may serve a wellspring of information to encourage such futuristic research endeavors.

Keywords: Hiding information in plain sight, stealth communication, oblivious information exchange, conceal, steganography.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1558

191 Optimal Document Archiving and Fast Information Retrieval

Authors: Hazem M. El-Bakry, Ahmed A. Mohammed

Abstract:

In this paper, an intelligent algorithm for optimal document archiving is presented. It is kown that electronic archives are very important for information system management. Minimizing the size of the stored data in electronic archive is a main issue to reduce the physical storage area. Here, the effect of different types of Arabic fonts on electronic archives size is discussed. Simulation results show that PDF is the best file format for storage of the Arabic documents in electronic archive. Furthermore, fast information detection in a given PDF file is introduced. Such approach uses fast neural networks (FNNs) implemented in the frequency domain. The operation of these networks relies on performing cross correlation in the frequency domain rather than spatial one. It is proved mathematically and practically that the number of computation steps required for the presented FNNs is less than that needed by conventional neural networks (CNNs). Simulation results using MATLAB confirm the theoretical computations.

Keywords: Information Storage and Retrieval, Electronic Archiving, Fast Information Detection, Cross Correlation, Frequency Domain.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1532

190 Application of GIS-Based Construction Engineering: An Electronic Document Management System

Authors: Mansour N. Jadid

Abstract:

This paper describes the implementation of a GIS to provide decision support for successfully monitoring the movements and storage of materials, hence ensuring that finished products travel from the point of origin to the destination construction site through the supply-chain management (SCM) system. This system ensures the efficient operation of suppliers, manufacturers, and distributors by determining the shortest path from the point of origin to the final destination to reduce construction costs, minimize time, and enhance productivity. These systems are essential to the construction industry because they reduce costs and save time, thereby improve productivity and effectiveness. This study describes a typical supply-chain model and a geographical information system (GIS)-based SCM that focuses on implementing an electronic document management system, which maps the application framework to integrate geodetic support with the supply-chain system. This process provides guidance for locating the nearest suppliers to fill the information needs of project members in different locations. Moreover, this study illustrates the use of a GIS-based SCM as a collaborative tool in innovative methods for implementing Web mapping services, as well as aspects of their integration by generating an interactive GIS for the construction industry platform.

Keywords: Construction, coordinate, engineering, GIS, management, map.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1399

189 Identification of Most Frequently Occurring Lexis in Winnings-announcing Unsolicited Bulke-mails

Authors: Jatinderkumar R. Saini, Apurva A. Desai

Abstract:

e-mail has become an important means of electronic communication but the viability of its usage is marred by Unsolicited Bulk e-mail (UBE) messages. UBE consists of many types like pornographic, virus infected and 'cry-for-help' messages as well as fake and fraudulent offers for jobs, winnings and medicines. UBE poses technical and socio-economic challenges to usage of e-mails. To meet this challenge and combat this menace, we need to understand UBE. Towards this end, the current paper presents a content-based textual analysis of nearly 3000 winnings-announcing UBE. Technically, this is an application of Text Parsing and Tokenization for an un-structured textual document and we approach it using Bag Of Words (BOW) and Vector Space Document Model techniques. We have attempted to identify the most frequently occurring lexis in the winnings-announcing UBE documents. The analysis of such top 100 lexis is also presented. We exhibit the relationship between occurrence of a word from the identified lexisset in the given UBE and the probability that the given UBE will be the one announcing fake winnings. To the best of our knowledge and survey of related literature, this is the first formal attempt for identification of most frequently occurring lexis in winningsannouncing UBE by its textual analysis. Finally, this is a sincere attempt to bring about alertness against and mitigate the threat of such luring but fake UBE.

Keywords: Lexis, Unsolicited Bulk e-mail (UBE), Vector SpaceDocument Model, Winnings, Lottery

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1490

188 Towards Clustering of Web-based Document Structures

Authors: Matthias Dehmer, Frank Emmert Streib, Jürgen Kilian, Andreas Zulauf

Abstract:

Methods for organizing web data into groups in order to analyze web-based hypertext data and facilitate data availability are very important in terms of the number of documents available online. Thereby, the task of clustering web-based document structures has many applications, e.g., improving information retrieval on the web, better understanding of user navigation behavior, improving web users requests servicing, and increasing web information accessibility. In this paper we investigate a new approach for clustering web-based hypertexts on the basis of their graph structures. The hypertexts will be represented as so called generalized trees which are more general than usual directed rooted trees, e.g., DOM-Trees. As a important preprocessing step we measure the structural similarity between the generalized trees on the basis of a similarity measure d. Then, we apply agglomerative clustering to the obtained similarity matrix in order to create clusters of hypertext graph patterns representing navigation structures. In the present paper we will run our approach on a data set of hypertext structures and obtain good results in Web Structure Mining. Furthermore we outline the application of our approach in Web Usage Mining as future work.

Keywords: Clustering methods, graph-based patterns, graph similarity, hypertext structures, web structure mining

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1464

187 Application of a Similarity Measure for Graphs to Web-based Document Structures

Authors: Matthias Dehmer, Frank Emmert Streib, Alexander Mehler, Jürgen Kilian, Max Mühlhauser

Abstract:

Due to the tremendous amount of information provided by the World Wide Web (WWW) developing methods for mining the structure of web-based documents is of considerable interest. In this paper we present a similarity measure for graphs representing web-based hypertext structures. Our similarity measure is mainly based on a novel representation of a graph as linear integer strings, whose components represent structural properties of the graph. The similarity of two graphs is then defined as the optimal alignment of the underlying property strings. In this paper we apply the well known technique of sequence alignments for solving a novel and challenging problem: Measuring the structural similarity of generalized trees. In other words: We first transform our graphs considered as high dimensional objects in linear structures. Then we derive similarity values from the alignments of the property strings in order to measure the structural similarity of generalized trees. Hence, we transform a graph similarity problem to a string similarity problem for developing a efficient graph similarity measure. We demonstrate that our similarity measure captures important structural information by applying it to two different test sets consisting of graphs representing web-based document structures.

Keywords: Graph similarity, hierarchical and directed graphs, hypertext, generalized trees, web structure mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1843

186 A Hybrid Ontology Based Approach for Ranking Documents

Authors: Sarah Motiee, Azadeh Nematzadeh, Mehrnoush Shamsfard

Abstract:

Increasing growth of information volume in the internet causes an increasing need to develop new (semi)automatic methods for retrieval of documents and ranking them according to their relevance to the user query. In this paper, after a brief review on ranking models, a new ontology based approach for ranking HTML documents is proposed and evaluated in various circumstances. Our approach is a combination of conceptual, statistical and linguistic methods. This combination reserves the precision of ranking without loosing the speed. Our approach exploits natural language processing techniques to extract phrases from documents and the query and doing stemming on words. Then an ontology based conceptual method will be used to annotate documents and expand the query. To expand a query the spread activation algorithm is improved so that the expansion can be done flexible and in various aspects. The annotated documents and the expanded query will be processed to compute the relevance degree exploiting statistical methods. The outstanding features of our approach are (1) combining conceptual, statistical and linguistic features of documents, (2) expanding the query with its related concepts before comparing to documents, (3) extracting and using both words and phrases to compute relevance degree, (4) improving the spread activation algorithm to do the expansion based on weighted combination of different conceptual relationships and (5) allowing variable document vector dimensions. A ranking system called ORank is developed to implement and test the proposed model. The test results will be included at the end of the paper.

Keywords: Document ranking, Ontology, Spread activation algorithm, Annotation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1572

185 ORank: An Ontology Based System for Ranking Documents

Authors: Mehrnoush Shamsfard, Azadeh Nematzadeh, Sarah Motiee

Abstract:

Increasing growth of information volume in the internet causes an increasing need to develop new (semi)automatic methods for retrieval of documents and ranking them according to their relevance to the user query. In this paper, after a brief review on ranking models, a new ontology based approach for ranking HTML documents is proposed and evaluated in various circumstances. Our approach is a combination of conceptual, statistical and linguistic methods. This combination reserves the precision of ranking without loosing the speed. Our approach exploits natural language processing techniques for extracting phrases and stemming words. Then an ontology based conceptual method will be used to annotate documents and expand the query. To expand a query the spread activation algorithm is improved so that the expansion can be done in various aspects. The annotated documents and the expanded query will be processed to compute the relevance degree exploiting statistical methods. The outstanding features of our approach are (1) combining conceptual, statistical and linguistic features of documents, (2) expanding the query with its related concepts before comparing to documents, (3) extracting and using both words and phrases to compute relevance degree, (4) improving the spread activation algorithm to do the expansion based on weighted combination of different conceptual relationships and (5) allowing variable document vector dimensions. A ranking system called ORank is developed to implement and test the proposed model. The test results will be included at the end of the paper.

Keywords: Document ranking, Ontology, Spread activation algorithm, Annotation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1835

184 Encrypter Information Software Using Chaotic Generators

Authors: Cardoza-Avendaño L., López-Gutiérrez R.M., Inzunza-González E., Cruz-Hernández C., García-Guerrero E., Spirin V., Serrano H.

Abstract:

This document shows a software that shows different chaotic generator, as continuous as discrete time. The software gives the option for obtain the different signals, using different parameters and initial condition value. The program shows then critical parameter for each model. All theses models are capable of encrypter information, this software show it too.

Keywords: cryptography, chaotic attractors, software.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1449

183 Collaborative Environmental Management: A Case Study Research of Stakeholders’ Collaboration in the Nigerian Oil-producing Region

Authors: Favour Makuochukwu Orji, Yingkui Zhao

Abstract:

A myriad of environmental issues face the Nigerian industrial region, resulting from; oil and gas production, mining, manufacturing and domestic wastes. Amidst these, much effort has been directed by stakeholders in the Nigerian oil producing regions, because of the impacts of the region on the wider Nigerian economy. Although collaborative environmental management has been noted as an effective approach in managing environmental issues, little attention has been given to the roles and practices of stakeholders in effecting a collaborative environmental management framework for the Nigerian oil-producing region. This paper produces a framework to expand and deepen knowledge relating to stakeholders aspects of collaborative roles in managing environmental issues in the Nigeria oil-producing region. The knowledge is derived from analysis of stakeholders’ practices – studied through multiple case studies using document analysis. Selected documents of key stakeholders – Nigerian government agencies, multi-national oil companies and host communities, were analyzed. Open and selective coding was employed manually during document analysis of data collected from the offices and websites of the stakeholders. The findings showed that the stakeholders have a range of roles, practices, interests, drivers and barriers regarding their collaborative roles in managing environmental issues. While they have interests for efficient resource use, compliance to standards, sharing of responsibilities, generating of new solutions, and shared objectives; there is evidence of major barriers and these include resource allocation, disjointed policy, ineffective monitoring, diverse socio- economic interests, lack of stakeholders’ commitment and limited knowledge sharing. However, host communities hold deep concerns over the collaborative roles of stakeholders for economic interests, particularly, where government agencies and multi-national oil companies are involved. With these barriers and concerns, a genuine stakeholders’ collaboration is found to be limited, and as a result, optimal environmental management practices and policies have not been successfully implemented in the Nigeria oil-producing region. A framework is produced that describes practices that characterize collaborative environmental management might be employed to satisfy the stakeholders’ interests. The framework recommends critical factors, based on the findings, which may guide a collaborative environmental management in the oil producing regions. The recommendations are designed to re-define the practices of stakeholders in managing environmental issues in the oil producing regions, not as something wholly new, but as an approach essential for implementing a sustainable environmental policy. This research outcome may clarify areas for future research as well as to contribute to industry guidance in the area of collaborative environmental management.

Keywords: Collaborative environmental management framework, document analysis, case studies, multinational oil companies, Nigerian oil-producing region, stakeholders analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2401

182 Web Data Scraping Technology Using Term Frequency Inverse Document Frequency to Enhance the Big Data Quality on Sentiment Analysis

Authors: Sangita Pokhrel, Nalinda Somasiri, Rebecca Jeyavadhanam, Swathi Ganesan

Abstract:

Tourism is a booming industry with huge future potential for global wealth and employment. There are countless data generated over social media sites every day, creating numerous opportunities to bring more insights to decision-makers. The integration of big data technology into the tourism industry will allow companies to conclude where their customers have been and what they like. This information can then be used by businesses, such as those in charge of managing visitor centres or hotels, etc., and the tourist can get a clear idea of places before visiting. The technical perspective of natural language is processed by analysing the sentiment features of online reviews from tourists, and we then supply an enhanced long short-term memory (LSTM) framework for sentiment feature extraction of travel reviews. We have constructed a web review database using a crawler and web scraping technique for experimental validation to evaluate the effectiveness of our methodology. The text form of sentences was first classified through VADER and RoBERTa model to get the polarity of the reviews. In this paper, we have conducted study methods for feature extraction, such as Count Vectorization and Term Frequency – Inverse Document Frequency (TFIDF) Vectorization and implemented Convolutional Neural Network (CNN) classifier algorithm for the sentiment analysis to decide if the tourist’s attitude towards the destinations is positive, negative, or simply neutral based on the review text that they posted online. The results demonstrated that from the CNN algorithm, after pre-processing and cleaning the dataset, we received an accuracy of 96.12% for the positive and negative sentiment analysis.

Keywords: Counter vectorization, Convolutional Neural Network, Crawler, data technology, Long Short-Term Memory, LSTM, Web Scraping, sentiment analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 106

181 Restructuring of XML Documents in the Form of Ontologies

Authors: Jamal Bakkas, Mohamed Bahaj, Abdellatif Soklabi

Abstract:

The intense use of the web has made it a very changing environment, its content is in permanent evolution to adapt to the demands. The standards have accompanied this evolution by passing from standards that regroup data with their presentations without any structuring such as HTML, to standards that separate both and give more importance to the structural aspect of the content such as XML standard and its derivatives. Currently, with the appearance of the Semantic Web, ontologies become increasingly present on the web and standards that allow their representations as OWL and RDF/RDFS begin to gain momentum. This paper provided an automatic method that converts XML schema document to ontologies represented in OWL.

Keywords: XML Schema, OWL, RDB, Mapping, Ontology.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2333

180 Juxtaposing South Africa’s Private Sector and Its Public Service Regarding Innovation Diffusion, to Explore the Obstacles to E-Governance

Authors: Petronella Jonck, Freda van der Walt

Abstract:

Despite the benefits of innovation diffusion in the South African public service, implementation thereof seems to be problematic, particularly with regard to e-governance which would enhance the quality of service delivery, especially accessibility, choice, and mode of operation. This paper reports on differences between the public service and the private sector in terms of innovation diffusion. Innovation diffusion will be investigated to explore identified obstacles that are hindering successful implementation of e-governance. The research inquiry is underpinned by the diffusion of innovation theory, which is premised on the assumption that innovation has a distinct channel, time, and mode of adoption within the organisation. A comparative thematic document analysis was conducted to investigate organisational differences with regard to innovation diffusion. A similar approach has been followed in other countries, where the same conceptual framework has been used to guide document analysis in studies in both the private and the public sectors. As per the recommended conceptual framework, three organisational characteristics were emphasised, namely the external characteristics of the organisation, the organisational structure, and the inherent characteristics of the leadership. The results indicated that the main difference in the external characteristics lies in the focus and the clientele of the private sector. With regard to organisational structure, private organisations have veto power, which is not the case in the public service. Regarding leadership, similarities were observed in social and environmental responsibility and employees’ attitudes towards immediate supervision. Differences identified included risk taking, the adequacy of leadership development, organisational approaches to motivation and involvement in decision making, and leadership style. Due to the organisational differences observed, it is recommended that differentiated strategies be employed to ensure effective innovation diffusion, and ultimately e-governance. It is recommended that the results of this research be used to stimulate discussion on ways to improve collaboration between the mentioned sectors, to capitalise on the benefits of each sector.

Keywords: E-governance, ICT, innovation diffusion, comparative analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1739

179 Design and Construction of PIC-Based IR Remote Control Moving Robot

Authors: Sanda Win, Tin Shein, Khin Maung Latt

Abstract:

This document describes an electronic speed control designed to drive two DC motors from a 6 V battery pack to be controlled by a commercial universal infrared remote control hand set. Conceived for a tank-like vehicle, one motor drives the left side wheels or tracks and the other motor drives the right side. As it is shown here, there is a left-right steering input and a forward– backward throttles input, like would be used on a model car. It is designed using a microcontroller PIC16F873A.

Keywords: Assembly Language, Direction Control, SpeedControl, PIC 16F 873A

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5129

178 SAĞLIK-NET Project in Turkey and HL7 v3 Implementation

Authors: K. Turhan, B. Kurt, E. Uzun

Abstract:

This paper describes Clinical Document Architecture Release Two (CDA R2) standard and a client application for messaging with SAĞLIK-NET project developed by The Ministry of Health of Turkey. CDA R2 , developed by Health Level 7 (HL7) organization and approved by American National Standards Institute (ANSI) in 2004, to standardize medical information to be able to share semantically and syntactically. In this study, a client application compatible with HL7 V3 for a project named SAĞLIKNET, aimed to build a National Health Information System by Turkey. Moreover, CDA conformance of this application will also be evaluated.

Keywords: HL7 V3, CDA, Interoperability, Web Service.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3585

177 Cross-Search Technique and its Visualization of Peer-to-Peer Distributed Clinical Documents

Authors: Yong Jun Choi, Juman Byun, Simon Berkovich

Abstract:

One of the ubiquitous routines in medical practice is searching through voluminous piles of clinical documents. In this paper we introduce a distributed system to search and exchange clinical documents. Clinical documents are distributed peer-to-peer. Relevant information is found in multiple iterations of cross-searches between the clinical text and its domain encyclopedia.

Keywords: Clinical documents, cross-search, document exchange, information retrieval, peer-to-peer.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1254

176 Integrating Low and High Level Object Recognition Steps

Authors: András Barta, István Vajk

Abstract:

In pattern recognition applications the low level segmentation and the high level object recognition are generally considered as two separate steps. The paper presents a method that bridges the gap between the low and the high level object recognition. It is based on a Bayesian network representation and network propagation algorithm. At the low level it uses hierarchical structure of quadratic spline wavelet image bases. The method is demonstrated for a simple circuit diagram component identification problem.

Keywords: Object recognition, Bayesian network, Wavelets, Document processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1442

175 Integrating Low and High Level Object Recognition Steps by Probabilistic Networks

Authors: András Barta, István Vajk

Abstract:

In pattern recognition applications the low level segmentation and the high level object recognition are generally considered as two separate steps. The paper presents a method that bridges the gap between the low and the high level object recognition. It is based on a Bayesian network representation and network propagation algorithm. At the low level it uses hierarchical structure of quadratic spline wavelet image bases. The method is demonstrated for a simple circuit diagram component identification problem.

Keywords: Object recognition, Bayesian network, Wavelets, Document processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1628

174 Virtual or Virtually U: Educational Institutions in Second Life

Authors: Nancy Jennings, Chris Collins

Abstract:

Educational institutions are increasingly exploring the affordances of 3D virtual worlds for instruction and research, but few studies have been done to document current practices and uses of this emerging technology. This observational survey examines the virtual presences of 170 accredited educational institutions found in one such 3D virtual world called Second Life®, created by San- Francisco based Linden Lab®. The study focuses on what educational institutions look like in this virtual environment, the types of spaces educational institutions are creating or simulating, and what types of activities are being conducted.

Keywords: educational technology, emerging technology, metaverse, Second Life, virtual worlds

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1929

173 Wireless Sensor Networks:A Survey on Ultra-Low Power-Aware Design

Authors: Itziar Marín, Eduardo Arceredillo, Aitzol Zuloaga, Jagoba Arias

Abstract:

Distributed wireless sensor network consist on several scattered nodes in a knowledge area. Those sensors have as its only power supplies a pair of batteries that must let them live up to five years without substitution. That-s why it is necessary to develop some power aware algorithms that could save battery lifetime as much as possible. In this is document, a review of power aware design for sensor nodes is presented. As example of implementations, some resources and task management, communication, topology control and routing protocols are named.

Keywords: Low Power Design, Power Awareness, RemoteSensing, Wireless Sensor Networks (WSN).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2144

172 The Semantic Web: a New Approach for Future World Wide Web

Authors: Sahar Nasrolahi, Mahdi Nikdast, Mehrdad Mahdavi Boroujerdi

Abstract:

The purpose of semantic web research is to transform the Web from a linked document repository into a distributed knowledge base and application platform, thus allowing the vast range of available information and services to be more efficiently exploited. As a first step in this transformation, languages such as OWL have been developed. Although fully realizing the Semantic Web still seems some way off, OWL has already been very successful and has rapidly become a defacto standard for ontology development in fields as diverse as geography, geology, astronomy, agriculture, defence and the life sciences. The aim of this paper is to classify key concepts of Semantic Web as well as introducing a new practical approach which uses these concepts to outperform Word Wide Web.

Keywords: Semantic Web, Ontology, OWL, Microformat, Word Wide Web.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1559