Search results for: scene text
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 671

551 Concrete Recycling in Egypt for Construction Applications: A technical and Financial Feasibility Model

Authors: Omar Farahat Hassanein, A. Samer Ezeldin

Abstract:

The construction industry is a very dynamic field. Every day new technologies and methods are developed to speed up the process and increase its efficiency. Hence, if a project uses fewer resources it will be more efficient.

This paper examines the recycling of concrete construction and demolition (C&D) waste to reuse it as aggregates in on-site applications for construction projects in Egypt and possibly in the Middle East. The study focuses on a stationary plant setting. The machinery set-up used in the plant is analyzed technically and financially.

The findings are gathered and grouped to obtain a comprehensive cost-benefit financial model to demonstrate the feasibility of establishing and operating a concrete recycling plant. Furthermore, a detailed business plan including the time and hierarchy is proposed. 

Keywords: Construction wastes, recycling, sustainability, financial model, concrete recycling, concrete life cycle.

Downloads: 3265
550 Ontology-based Concept Weighting for Text Documents

Authors: Hmway Hmway Tar, Thi Thi Soe Nyaunt

Abstract:

Document clustering has become an essential technology with the popularity of the Internet, which makes fast, high-quality document clustering techniques a core topic. Text clustering, or simply clustering, is about discovering semantically related groups in an unstructured collection of documents. Clustering has been popular for a long time because it provides unique ways of digesting and generalizing large amounts of information. One issue in clustering is extracting the proper features (concepts) of a problem domain. Existing clustering technology mainly focuses on term weight calculation. To achieve more accurate document clustering, more informative features, including concept weights, are important. Feature selection matters for the clustering process because irrelevant or redundant features may mislead the clustering results. To counteract this issue, the proposed system presents concept weights for a text clustering system based on the k-means algorithm and the principles of ontology, so that the importance of the words in a cluster can be identified by their weight values. To a certain extent, it has resolved the semantic problem in specific areas.

Keywords: Clustering, Concept Weight, Document clustering, Feature Selection, Ontology

Downloads: 2362
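
The concept-weighting step described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' system: the toy ontology, the function names and the normalized-frequency weighting are all assumptions made for illustration, and the k-means step itself is omitted. The sketch only shows why mapping terms to ontology concepts lets two documents with no shared words still look similar.

```python
import math
from collections import Counter

# Hypothetical toy ontology (term -> concept); not taken from the paper.
ONTOLOGY = {
    "cat": "animal", "dog": "animal", "horse": "animal",
    "car": "vehicle", "truck": "vehicle", "bus": "vehicle",
}


def concept_weights(text):
    """Return normalized concept frequencies for one document."""
    concepts = [ONTOLOGY[t] for t in text.lower().split() if t in ONTOLOGY]
    counts = Counter(concepts)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()} if total else {}


def cosine(a, b):
    """Cosine similarity between two sparse concept-weight vectors."""
    dot = sum(a[k] * b.get(k, 0.0) for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0
```

Vectors produced by `concept_weights` could then be passed to any clustering routine; a term-only representation would treat "cat dog" and "horse" as unrelated, whereas the concept vectors coincide.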
549 MIMO Radar-Based System for Structural Health Monitoring and Geophysical Applications

Authors: Davide D’Aria, Paolo Falcone, Luigi Maggi, Aldo Cero, Giovanni Amoroso

Abstract:

The paper presents a methodology for real-time structural health monitoring and geophysical applications. The key elements of the system are a high-performance MIMO radar sensor, an optical camera and a dedicated set of software algorithms encompassing interferometry, tomography and photogrammetry. The MIMO radar sensor proposed in this work provides extremely high sensitivity to displacements, making the system able to react to tiny deformations (down to tens of microns) on time scales spanning from milliseconds to hours. The MIMO feature makes the system capable of providing a set of two-dimensional images of the observed scene, each mapped on the azimuth-range directions with notable resolution in both dimensions and with an outstanding repetition rate. The back-scattered energy, which is distributed in 3D space, is projected onto a 2D plane, where each pixel has as coordinates the line-of-sight distance and the cross-range azimuthal angle. At the same time, the high-performance processing unit allows the observed scene to be sensed with remarkable refresh periods (down to milliseconds), thus opening the way for combined static and dynamic structural health monitoring. Thanks to the smart TX/RX antenna array layout, the MIMO data can be processed through a tomographic approach to reconstruct the three-dimensional map of the observed scene. This 3D point cloud is then accurately mapped onto a 2D digital optical image through photogrammetric techniques, allowing for easy and straightforward interpretation of the measurements. Once the three-dimensional image is reconstructed, a 'repeat-pass' interferometric approach is exploited to provide the user with high-frequency three-dimensional motion/vibration estimates for each point of the reconstructed image. At this stage, the methodology leverages consolidated atmospheric correction algorithms to provide reliable displacement and vibration measurements.

Keywords: Interferometry, MIMO RADAR, SAR, tomography.

Downloads: 850
548 Development of Fake News Model Using Machine Learning through Natural Language Processing

Authors: Sajjad Ahmed, Knut Hinkelmann, Flavio Corradini

Abstract:

Fake news detection research is still at an early stage, as this is a relatively new phenomenon that has only recently drawn society's interest. Machine learning helps to solve complex problems and to build AI systems, especially in cases where the knowledge is tacit or not known. To identify fake news, we applied three machine learning classifiers: Passive Aggressive, Naïve Bayes, and Support Vector Machine. Simple classification is not completely correct in fake news detection because classification methods are not specialized for fake news. By integrating machine learning and text-based processing, we can detect fake news and build classifiers that can classify the news data. Text classification mainly focuses on extracting various features of text and then incorporating those features into classification. The big challenge in this area is the lack of an efficient way to differentiate between fake and non-fake news due to the unavailability of corpora. We applied the three classifiers to two publicly available datasets. Experimental analysis based on the existing datasets indicates very encouraging and improved performance.

Keywords: Fake news detection, types of fake news, machine learning, natural language processing, classification techniques.

Downloads: 1449
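
Of the three classifiers named in the abstract above, Naïve Bayes is the simplest to sketch. The following is a minimal bag-of-words multinomial Naïve Bayes with Laplace smoothing, written in plain Python. It is an illustration only: the class name, the whitespace tokenization and the smoothing choice are assumptions, not the authors' pipeline or feature set.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over whitespace tokens (illustrative only)."""

    def fit(self, docs, labels):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.class_counts = Counter(labels)      # label -> number of documents
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        n_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            total = sum(self.word_counts[label].values())
            score = math.log(count / n_docs)  # log prior
            for w in doc.lower().split():
                # Laplace (add-one) smoothing so unseen words do not zero out a class.
                score += math.log(
                    (self.word_counts[label][w] + 1) / (total + len(self.vocab))
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

A usage example with invented headlines: after `NaiveBayes().fit(docs, labels)`, calling `predict` on a new headline returns the label with the highest smoothed log-probability.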
547 Teachers' Perceptions on the Use of E-Books as Textbooks in the Classroom

Authors: Abd Mutalib Embong, Azelin M Noor, Razol Mahari M Ali, Zulqarnain Abu Bakar, Abdur-Rahman Mohamed Amin

Abstract:

At a time when electronic books, or e-books, offer students a fun way of learning, teachers who are used to paper textbooks may find it a new challenge to use them as part of the learning process. There are various types of e-books available to suit students' knowledge, characteristics, abilities, and interests. The paper discusses teachers' perceptions of the use of e-books in place of paper textbooks in the classroom. A survey was conducted of 72 teachers who use e-books as textbooks. It was found that a majority of these teachers had good perceptions of the use of e-books. However, they had some minor problems using the devices. These can be overcome with some strategies and a suggested framework.

Keywords: Classroom, E-books, perception, teacher.

Downloads: 5695
546 Semi-Automatic Analyzer to Detect Authorial Intentions in Scientific Documents

Authors: Kanso Hassan, Elhore Ali, Soule-dupuy Chantal, Tazi Said

Abstract:

Information retrieval aims to study models and build systems that allow a user to find the relevant documents suited to his or her information need. Information search remains a difficult problem because of the difficulty of representing and processing natural language phenomena such as polysemy. Intentional structures promise to be a new paradigm for extending existing document structures and enhancing the different phases of document processing such as creation, editing, search and retrieval. Recognizing the intentions of the authors of texts can reduce the scale of this problem. In this article, we present an intention recognition system based on a semi-automatic method for extracting intentional information from a corpus of text. The system is also able to update the ontology of intentions to enrich the knowledge base containing all possible intentions of a domain. This approach uses the construction of a semi-formal ontology, which is considered as the conceptualization of the intentional information contained in a text. An experiment on scientific publications in the field of computer science was carried out to validate this approach.

Keywords: Information research, text analysis, intentional structure, segmentation, ontology, natural language processing.

Downloads: 1601
545 A Study on Fantasy Images Represented on the Films: Focused on Mise-en-Scène Element

Authors: Somi Nah

Abstract:

The fantasy genre depicts an imaginary world that triggers popular interest through a created view of the world, and a fantasy is defined as a story that illustrates an imaginary world with scientific or horror elements at its center. This study does not focus on the narrative of the fantasy, i.e. the adventurous story, but concentrates on the image of the fantasy, examining its relationship with intended themes and the differences among cultures due to the meanings of materials. We selected films from the 2000s that are internationally recognized as expressing unique images of fantasy and that contain the theme of love. The five selected films include two European films, Amelie from Montmartre (2001) and The Science of Sleep (2005), and three Asian films, Citizen Dog from Thailand (2004), Memories of Matsuko from Japan (2006), and I'm a Cyborg, but That's OK from Korea (2006). These films share common characteristics: they offer small lessons and feelings about life through expressions of fantasy images, as if they were fairy tales for adults, and they lead the audience to reflect on their days and revive forgotten childhood dreams. We analyze the images of fantasy in each of the films on the basis of the elements of Mise-en-Scène (setting and props, costume, hair and make-up, facial expressions and body language, lighting and color, positioning of characters, and objects within a frame).

Keywords: Mise-en-scène, fantasy images, films, visualization.

Downloads: 4877
544 Robot Control by ERPs of Brain Waves

Authors: K. T. Sun, Y. H. Tai, H. W. Yang, H. T. Lin

Abstract:

This paper presents a technique for robot control by event-related potentials (ERPs) of brain waves. With the proposed technique, people with severe physical disabilities can freely browse the outside world. A specific component of ERPs, N2P3, was found and used to control the movement of the robot and the view of its camera through the designed brain-computer interface (BCI). Users only need to watch the stimulus of the attended button on the BCI; the evoked potential of the target button, N2P3, has the greatest amplitude among all control buttons. An experimental scene was constructed in which the robot had to walk to a specific position, move the camera view to see the mission instructions, and then complete the task. Twelve volunteers participated in this experiment, and the results showed that the correct rate of BCI control reached 80% and the average execution time for completing the mission was 353 seconds. This research makes four main contributions: (1) finding an efficient component of ERPs, N2P3, for BCI control, (2) embedding the robot's viewpoint image into the user interface for robot control, (3) designing an experimental scene and conducting the experiment, and (4) evaluating the performance of the proposed system to assess its practicability.

Keywords: Brain-computer interface (BCI), event-related potentials (ERPs), robot control, severe physical disabilities.

Downloads: 2559
543 A Character Detection Method for Ancient Yi Books Based on Connected Components and Regressive Character Segmentation

Authors: Xu Han, Shanxiong Chen, Shiyu Zhu, Xiaoyu Lin, Fujia Zhao, Dingwang Wang

Abstract:

Character detection is an important issue for character recognition of ancient Yi books. The accuracy of detection directly affects the recognition effect of ancient Yi books. Considering the complex layout, the lack of standard typesetting and the mixed arrangement between images and texts, we propose a character detection method for ancient Yi books based on connected components and regressive character segmentation. First, the scanned images of ancient Yi books are preprocessed with nonlocal mean filtering, and then a modified local adaptive threshold binarization algorithm is used to obtain the binary images to segment the foreground and background for the images. Second, the non-text areas are removed by the method based on connected components. Finally, the single character in the ancient Yi books is segmented by our method. The experimental results show that the method can effectively separate the text areas and non-text areas for ancient Yi books and achieve higher accuracy and recall rate in the experiment of character detection, and effectively solve the problem of character detection and segmentation in character recognition of ancient books.

Keywords: Computing methodologies, interest point, salient region detections, image segmentation.

Downloads: 793
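
The connected-component stage mentioned in the abstract above can be sketched in plain Python. This is a generic 4-connected labeling with a size filter, not the authors' method: the binary image is assumed to be a list of rows of 0/1 values, and the `min_size` rule for discarding non-text regions is a hypothetical stand-in for whatever criteria the paper actually uses.

```python
from collections import deque


def connected_components(img):
    """Return [((top, left, bottom, right), pixel_count), ...] for the
    4-connected foreground (value 1) regions of a binary image."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for r in range(h):
        for c in range(w):
            if img[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right, size = r, c, r, c, 0
            while queue:
                y, x = queue.popleft()
                size += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and img[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(((top, left, bottom, right), size))
    return components


def drop_small(components, min_size):
    """Keep bounding boxes of components with at least min_size pixels
    (a hypothetical noise filter, not the paper's criterion)."""
    return [box for box, size in components if size >= min_size]
```

The bounding boxes that survive filtering would be the candidate regions handed to a later character segmentation step.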
542 Industrial Waste Monitoring

Authors: Khairuddin Bin Osman, Ngo Boon Kiat, A. Hamid Bin Hamidon, Khairul Azha Bin A. Aziz, Hazli Rafis Bin Abdul Rahman, Mazran Bin Esro

Abstract:

Conventional industrial monitoring systems are tedious and inefficient, and at times the integrity of the data is unreliable. The objective of this system is to monitor industrial processes, specifically the fluid level: it measures the instantaneous fluid level parameter and responds by text messaging the exact value of the parameter to the user when queried by a user with privileged access. The development of the embedded program code and the circuit for fluid level measurement are discussed as well. Suggestions for future implementations and efficient remote monitoring work are included.

Keywords: Industrial monitoring system, text messaging, embedded programming.

Downloads: 1633
541 AGHAZ: An Expert System Based Approach for the Translation of English to Urdu

Authors: Uzair Muhammad, Kashif Bilal, Atif Khan, M. Nasir Khan

Abstract:

Machine Translation (MT) of English text to its Urdu equivalent is a difficult challenge. Many attempts have been made, but only a few limited solutions have been provided so far. We present a direct approach, using an expert system to translate English text into its equivalent Urdu, using The Unicode Standard, Version 4.0 (ISBN 0-321-18578-1) Range: 0600–06FF. The expert system works with a knowledge base that contains grammatical patterns of English and Urdu, as well as a tense and gender-aware dictionary of Urdu words (with their English equivalents).

Keywords: Machine Translation, Multiword Expressions, Urdu language processing, POS Tagging for Urdu, Expert Systems.

Downloads: 2309
540 National Image in the Age of Mass Self-Communication: An Analysis of Internet Users' Perception of Portugal

Authors: L. Godinho, N. Teixeira

Abstract:

Nowadays, the massification of Internet access represents one of the major challenges to the traditional powers of the State, among them the power to control its external image. The virtual world has also sparked the interest of the social sciences, which consider it a new field of study, an immense open text where sense is expressed. In this paper, that immense text has been accessed in order to understand the perception that Internet users from all over the world have of Portugal. Ours is a quantitative and qualitative approach, as we have resorted to buzz, thematic and category analysis. The results confirm the predominance of the sea stereotype in others' vision of the Portuguese people, and provide evidence that national image has adapted to network communication through processes of individuation and paganization.

Keywords: Internet, national image, perception, web analytics.

Downloads: 1020
539 A New Recognition Scheme for Machine-Printed Arabic Texts Based on Neural Networks

Authors: Z. Shaaban

Abstract:

This paper presents a new approach to tackle the problem of recognizing machine-printed Arabic texts. Because of the difficulty of recognizing cursive Arabic words, the text has to be normalized and segmented to be ready for the recognition stage. The new scheme for recognizing Arabic characters depends on a classifier built from multiple parallel neural networks. The classifier has two phases. The first phase categorizes the input character into one of eight groups. The second phase classifies the character into one of the Arabic character classes in the group. The system achieved a high recognition rate.

Keywords: Neural Networks, character recognition, feature extraction, multiple networks, Arabic text.

Downloads: 1438
538 The Challenges of Hyper-Textual Learning Approach for Religious Education

Authors: Elham Shirvani–Ghadikolaei, Seyed Mahdi Sajjadi

Abstract:

State-of-the-art technology has a tremendous impact on our lives, and the education system has been influenced as well. This paper compares two learning spaces, text and hypertext, and discusses some challenges of using hypertext in religious education. Hypertext is an undeniable part of learning in today's world and is highly beneficial for the education process, from the classroom to the office and home. The paper addresses the following question: what are the consequences and challenges of applying hypertext in religious education? The results of this survey also demonstrate the role of curriculum designers and education planners in solving this problem.

Keywords: Hyper-textual, education, religious text, religious education.

Downloads: 1356
537 A Study of the Variability of Very Low Resolution Characters and the Feasibility of Their Discrimination Using Geometrical Features

Authors: Farshideh Einsele, Rolf Ingold

Abstract:

Current OCR technology does not allow accurate recognition of small text images, such as those found in web images. Our goal is to investigate new approaches to recognizing very low resolution text images containing antialiased character shapes. This paper presents a preliminary study on the variability of such characters and the feasibility of discriminating them using geometrical features. In a first stage we analyze the distribution of these features. In a second stage we present a study on the discriminative power for recognizing isolated characters, using various rendering methods and font properties. Finally, we present the results of our evaluation tests, leading to our conclusion and future focus.

Keywords: World Wide Web, document analysis, pattern recognition, Optical Character Recognition.

Downloads: 1334
536 Component-based Segmentation of Words from Handwritten Arabic Text

Authors: Jawad H AlKhateeb, Jianmin Jiang, Jinchang Ren, Stan S Ipson

Abstract:

Efficient preprocessing is essential for automatic recognition of handwritten documents. In this paper, techniques for segmenting words in handwritten Arabic text are presented. First, connected components (CCs) are extracted, and the distances among different components are analyzed. The statistical distribution of these distances is then obtained to determine an optimal threshold for word segmentation. Meanwhile, an improved projection-based method is employed for baseline detection. The proposed method has been successfully tested on the IFN/ENIT database consisting of 26459 Arabic words handwritten by 411 different writers, and the results were promising, with more accurate detection of the baseline and segmentation of words for further recognition.

Keywords: Arabic OCR, off-line recognition, Baseline estimation, Word segmentation.

Downloads: 2166
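
The gap-threshold idea in the abstract above (measure the distances between connected components, then split into words wherever a distance exceeds a threshold) can be sketched as follows. The abstract does not say how the optimal threshold is derived from the distance distribution, so using the mean gap as the default is purely an assumption here, as are the function name and the representation of components as horizontal `(start, end)` pixel spans.

```python
def segment_words(spans, threshold=None):
    """Group component spans (start, end) into words.

    A new word starts wherever the horizontal gap to the previous
    component exceeds the threshold. If no threshold is given, the
    mean gap is used (an illustrative choice, not the paper's rule).
    """
    if not spans:
        return []
    spans = sorted(spans)
    # Overlapping components count as a gap of zero.
    gaps = [max(0, b[0] - a[1]) for a, b in zip(spans, spans[1:])]
    if threshold is None:
        threshold = sum(gaps) / len(gaps) if gaps else 0
    words, current = [], [spans[0]]
    for gap, span in zip(gaps, spans[1:]):
        if gap > threshold:
            words.append(current)
            current = [span]
        else:
            current.append(span)
    words.append(current)
    return words
```

For real Arabic handwriting the spans would come from a connected-component pass and would be read right to left; the grouping logic is independent of reading direction.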
535 Emotions in Health Tweets: Analysis of American Government Official Accounts

Authors: García López

Abstract:

The Government Departments of Health have the task of informing and educating citizens about public health issues. For this, they use channels like Twitter, key in the search for health information and the propagation of content. The tweets, important in the virality of the content, may contain emotions that influence the contagion and exchange of knowledge. The goal of this study is to perform an analysis of the emotional projection of health information shared on Twitter by official American accounts: the disease control account CDCgov, National Institutes of Health, NIH, the government agency HHSGov, and the professional organization PublicHealth. For this, we used Tone Analyzer, an International Business Machines Corporation (IBM) tool specialized in emotion detection in text, corresponding to the categorical model of emotion representation. For 15 days, all tweets from these accounts were analyzed with the emotional analysis tool in text. The results showed that their tweets contain an important emotional load, a determining factor in the success of their communications. This exposes that official accounts also use subjective language and contain emotions. The predominance of emotion joy over sadness and the strong presence of emotions in their tweets stimulate the virality of content, a key in the work of informing that government health departments have.

Keywords: Emotions in tweets, emotion detection in text, health information on Twitter, American health official accounts, emotions on Twitter, emotions and content.

Downloads: 660
534 Techniques with Statistics for Web Page Watermarking

Authors: Mohamed Lahcen BenSaad, Sun XingMing

Abstract:

Information hiding, especially watermarking, is a promising technique for the protection of intellectual property rights. This technology is well advanced for multimedia, but the same has not been done for text. Web pages, like other documents, need protection against piracy. In this paper, some techniques are proposed to show how to hide information in web pages using features of the markup language used to describe these pages. Most of the techniques proposed here use white space, or variations in how the language represents elements, to hide information. Experiments on a very small page and an analysis of five thousand web pages show that these techniques have a wide bandwidth available for information hiding, and they might form a solid base for developing a robust algorithm for web page watermarking.

Keywords: Digital Watermarking, Information Hiding, Markup Language, Text watermarking, Software Watermarking.

Downloads: 1750
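
One way white space can carry hidden bits in a web page is sketched below. The scheme is an assumption chosen for brevity, not one of the paper's techniques: each line of the markup carries one bit as invisible trailing white space (a space for 0, a tab for 1), which browsers ignore when rendering. The function names are hypothetical, and the scheme is fragile because any tool that trims trailing white space destroys the mark.

```python
def embed(html, bits):
    """Hide a bit string in trailing white space, one bit per line."""
    lines = html.split("\n")
    if len(bits) > len(lines):
        raise ValueError("not enough lines to carry the payload")
    out = []
    for i, line in enumerate(lines):
        line = line.rstrip(" \t")  # clear any existing trailing white space
        if i < len(bits):
            line += "\t" if bits[i] == "1" else " "
        out.append(line)
    return "\n".join(out)


def extract(html):
    """Recover the hidden bits from trailing tabs (1) and spaces (0)."""
    bits = []
    for line in html.split("\n"):
        if line.endswith("\t"):
            bits.append("1")
        elif line.endswith(" "):
            bits.append("0")
    return "".join(bits)
```

The capacity of this particular scheme is one bit per line, which gives a rough feel for how the available bandwidth grows with page size.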
533 Compression of Semistructured Documents

Authors: Leo Galambos, Jan Lansky, Katsiaryna Chernik

Abstract:

EGOTHOR is a search engine that indexes the Web and allows us to search Web documents. Its hit list contains the URL and title of each hit, and also a snippet which tries to briefly show a match. The snippet can almost always be assembled by an algorithm that has full knowledge of the original document (mostly an HTML page). This implies that the search engine is required to store the full text of the documents as part of the index. Such a requirement leads us to choose an appropriate compression algorithm that would reduce the space demand. One solution could be to use common compression methods, for instance gzip or bzip2, but it might be preferable to develop a new method that takes advantage of the document structure, or rather, the textual character of the documents. Special text compression algorithms and methods for compressing XML documents already exist. The aim of this paper is to integrate the two approaches to achieve an optimal compression ratio.

Keywords: Compression, search engine, HTML, XML.

Downloads: 1532
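
As a baseline for the general-purpose methods the abstract above mentions, the compressed size of a stored document under gzip and bzip2 can be measured with Python's standard library. This only illustrates the baseline comparison; it does not implement the structure-aware method the paper aims for, and the function name is hypothetical.

```python
import bz2
import gzip


def compressed_sizes(document):
    """Return byte sizes of a document raw and under gzip and bzip2."""
    raw = document.encode("utf-8")
    return {
        "raw": len(raw),
        "gzip": len(gzip.compress(raw)),
        "bz2": len(bz2.compress(raw)),
    }
```

Running this over a sample of indexed pages would show how much space a structure-aware compressor has to beat before it is worth the extra complexity.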
532 A Combined Cipher Text Policy Attribute-Based Encryption and Timed-Release Encryption Method for Securing Medical Data in Cloud

Authors: G. Shruthi, Purohit Shrinivasacharya

Abstract:

The biggest problem in the cloud is securing outsourced data. A cloud environment cannot be considered trusted. Securing data becomes more challenging when outsourced data sources are managed by multiple outsourcers with different access rights. Several methods have been proposed to protect data confidentiality against the cloud service provider and to support fine-grained data access control. We propose a secure method that combines Cipher Text Policy Attribute-Based Encryption (CP-ABE) and Timed-Release Encryption (TRE) to control medical data storage in the public cloud.

Keywords: Attribute, encryption, security, trapdoor.

Downloads: 691
531 Event Information Extraction System (EIEE): FSM vs HMM

Authors: Shaukat Wasi, Zubair A. Shaikh, Sajid Qasmi, Hussain Sachwani, Rehman Lalani, Aamir Chagani

Abstract:

Automatic extraction of event information from social text streams (emails, social network sites, blogs, etc.) is a vital requirement for many applications, such as event planning and management systems and security applications. The key information components needed from event-related text are the event title, location, participants, date and time. Emails are distinct from other social text streams in layout, format and conversation style, and are the most commonly used communication channel for broadcasting and planning events. Therefore we have chosen emails as our dataset. In our work, we have employed two statistical NLP methods, Finite State Machines (FSM) and Hidden Markov Models (HMM), for the extraction of event-related contextual information. An application has been developed that compares the two methods on the event extraction task. It comprises two modules, one for each method, and works for both bulk and direct user input. The results are evaluated using Precision, Recall and F-Score. Experiments show that both methods achieve high performance and accuracy; however, HMM performed well on Title extraction, while FSM proved to be better for Venue, Date, and Time.

Keywords: Emails, Event Extraction, Event Detection, Finite state machines, Hidden Markov Model.

Downloads: 2273
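
The evaluation measures named in the abstract above (Precision, Recall and F-Score) can be computed for an extraction task as sketched below. The representation of extracted items as `(field, value)` pairs and the exact-match criterion are assumptions for illustration; the paper's own matching rules are not given in the abstract.

```python
def precision_recall_f1(predicted, gold):
    """Score extracted (field, value) pairs against a gold standard
    using exact matching."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

Scoring each field (title, venue, date, time) separately in this way is what makes a per-field comparison between two extractors possible.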
530 Designing Ontology-Based Knowledge Integration for Preprocessing of Medical Data in Enhancing a Machine Learning System for Coding Assignment of a Multi-Label Medical Text

Authors: Phanu Waraporn

Abstract:

This paper discusses the design of knowledge integration for clinical information extracted from distributed medical ontologies, in order to improve a machine learning-based multi-label coding assignment system. The proposed approach is implemented using a machine learning decision tree technique on university hospital data for patients with Coronary Heart Disease (CHD). The preliminary results show that the use of medical ontologies improves overall system performance.

Keywords: Medical Ontology, Knowledge Integration, Machine Learning, Medical Coding, Text Assignment.

Downloads: 1805
529 Urdu Nastaleeq Optical Character Recognition

Authors: Zaheer Ahmad, Jehanzeb Khan Orakzai, Inam Shamsher, Awais Adnan

Abstract:

This paper discusses the characteristics of the Urdu script and Urdu Nastaleeq, and presents a simple yet novel and robust technique to recognize printed Urdu script without a lexicon. Urdu, belonging to the family of Arabic scripts, is cursive and complex in nature. The main complexity of Urdu compound/connected text is not its connections but the forms/shapes that characters take when placed at the initial, middle or final position of a word. The character recognition technique presented here uses the inherent complexity of the Urdu script to solve the problem. A word is scanned and analyzed for its level of complexity; the point where the level of complexity changes is marked as a character boundary, and the character is segmented and fed to neural networks. A prototype of the system has been tested on Urdu text and currently achieves 93.4% accuracy on average.

Keywords: Cursive Script, OCR, Urdu.

Downloads: 2730
528 Feature Selection Methods for an Improved SVM Classifier

Authors: Daniel Morariu, Lucian N. Vintan, Volker Tresp

Abstract:

Text categorization is the problem of classifying text documents into a set of predefined classes. After a preprocessing step, the documents are typically represented as large sparse vectors. When training classifiers on large collections of documents, both the time and memory restrictions can be quite prohibitive. This justifies the application of feature selection methods to reduce the dimensionality of the document-representation vector. In this paper, three feature selection methods are evaluated: Random Selection, Information Gain (IG) and Support Vector Machine feature selection (called SVM_FS). We show that the best results were obtained with the SVM_FS method for a relatively small dimension of the feature vector. We also present a novel method to better correlate the SVM kernel's parameters (Polynomial or Gaussian kernel).

Keywords: Feature Selection, Learning with Kernels, Support Vector Machine, Classification.

Downloads: 1778
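
Of the three feature selection methods listed in the abstract above, Information Gain has a standard textbook definition that is easy to sketch: the reduction in class entropy obtained by splitting the documents on whether a term is present. The sketch below assumes documents are represented as sets of tokens and uses hypothetical function names; it is the generic formula, not the authors' implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(docs, labels, term):
    """Entropy reduction from splitting documents on presence of `term`.

    `docs` is a list of token sets aligned with `labels`.
    """
    with_term = [l for d, l in zip(docs, labels) if term in d]
    without_term = [l for d, l in zip(docs, labels) if term not in d]
    n = len(labels)
    return (
        entropy(labels)
        - len(with_term) / n * entropy(with_term)
        - len(without_term) / n * entropy(without_term)
    )
```

Ranking all vocabulary terms by this score and keeping the top few is one simple way to shrink the document vectors before training a classifier.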
527 A Similarity Measure for Clustering and its Applications

Authors: Guadalupe J. Torres, Ram B. Basnet, Andrew H. Sung, Srinivas Mukkamala, Bernardete M. Ribeiro

Abstract:

This paper introduces a measure of similarity between two clusterings of the same dataset produced by two different algorithms, or even by the same algorithm (K-means, for instance, usually produces different results when clustering the same dataset with different initializations). We then apply the measure to calculate the similarity between pairs of clusterings, with special interest directed at comparing the similarity between various machine clusterings and human clustering of datasets. The similarity measure can thus be used to identify the best (in terms of most similar to human) clustering algorithm for a specific problem at hand. Experimental results pertaining to the text categorization problem of a Portuguese corpus (wherein a translation-into-English approach is used) are presented, as well as results on the well-known benchmark IRIS dataset. The significance and other potential applications of the proposed measure are discussed.

Keywords: Clustering Algorithms, Clustering Applications, Similarity Measures, Text Clustering

Downloads 1519
526 Alignment of e-Government Policy Formulation with Practical Implementation: The Case of Sub-Saharan Africa

Authors: W. Munyoka, F. M. Manzira

Abstract:

The purpose of this study is to analyze how varying alignment of e-Government policies in four countries in the Sub-Saharan Africa region, namely South Africa, Seychelles, Mauritius and Cape Verde, leads to the success or failure of e-Government, and what should be done to ensure positive alignment that leads to e-Government project growth. In addition, the study aims to understand how various governments’ efforts in e-Government awareness campaign strategies, international cooperation, functional literacy and anticipated organizational change can influence implementation.

This study extensively reviews contemporary research in the field of e-Government and examines the respective national ICT policies, strategies and implemented e-Government projects for an in-depth comprehension of the status quo. Data is analyzed qualitatively and quantitatively to reach a conclusion.

The study found that resounding successes in strategic e-Government alignment were achieved in Seychelles, Mauritius, South Africa and Cape Verde (ranked 1 to 4, respectively).

The implication of the study is that policy makers in developing countries should put mechanisms in place for constant monitoring and evaluation of project implementation in line with ICT policies, to ensure that e-Government projects reach maturity and do not die midway through implementation, as is often observed in many countries. The study recommends that countries within the region make concerted collaborative efforts and build synergies with private sector players and international donor agencies to implement the set ICT policies.

Keywords: E-Government, ICT-Policy Alignment, Implementation, Sub-Saharan Africa.

Downloads 2297
525 A Proposed Approach for Emotion Lexicon Enrichment

Authors: Amr Mansour Mohsen, Hesham Ahmed Hassan, Amira M. Idrees

Abstract:

Document analysis is an important research field that aims to gather information by analyzing the data in documents. Because understanding what people actually want is an important target in many fields, sentiment analysis has become one of the vital fields closely related to document analysis. This research focuses on analyzing text documents to classify each document according to its opinion. The aim is to detect emotions in text documents by enriching the lexicon, adapting its content through semantic pattern extraction. The proposed approach is presented, and experiments from several perspectives reveal its positive impact on the classification results.

Keywords: Document analysis, sentimental analysis, emotion detection, WEKA tool, NRC Lexicon.

Downloads 1405
524 Stego Machine – Video Steganography using Modified LSB Algorithm

Authors: Mritha Ramalingam

Abstract:

Computer technology and the Internet have made a breakthrough in data communication. This has opened a whole new way of implementing steganography to ensure secure data transfer. Steganography is the fine art of hiding information. Hiding the message in a carrier file makes it possible to deny the existence of any message at all. This paper designs a stego machine, a steganographic application that hides data containing text in a computer video file and retrieves the hidden information. This is achieved by embedding a text file in a video file, in such a way that the video does not lose its functionality, using the Least Significant Bit (LSB) modification method. This method applies imperceptible modifications. The proposed method strives for high security through an eavesdropper's inability to detect the hidden information.
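The abstract does not spell out the modified LSB algorithm, so the following is a minimal sketch of plain LSB embedding only. It assumes the carrier is a raw byte buffer (for example, the pixel bytes of one uncompressed video frame); the function names and the 4-byte length header are hypothetical choices made for illustration.

```python
def embed(carrier: bytes, message: bytes) -> bytes:
    """Hide `message` in the least significant bit of each carrier byte."""
    # Assumed framing: a 4-byte big-endian length header precedes the message.
    payload = len(message).to_bytes(4, "big") + message
    bits = [(byte >> shift) & 1
            for byte in payload
            for shift in range(7, -1, -1)]
    if len(bits) > len(carrier):
        raise ValueError("carrier too small for this message")
    stego = bytearray(carrier)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the payload bit.
        stego[i] = (stego[i] & 0xFE) | bit
    return bytes(stego)


def _read_bytes(stego: bytes, start_bit: int, count: int) -> bytes:
    """Rebuild `count` bytes from the LSBs starting at position `start_bit`."""
    out = bytearray()
    for k in range(count):
        value = 0
        for offset in range(8):
            value = (value << 1) | (stego[start_bit + k * 8 + offset] & 1)
        out.append(value)
    return bytes(out)


def extract(stego: bytes) -> bytes:
    """Recover the hidden message written by `embed`."""
    length = int.from_bytes(_read_bytes(stego, 0, 4), "big")
    return _read_bytes(stego, 32, length)


# Stand-in carrier: 256 bytes playing the role of frame pixel data.
carrier = bytes(range(256))
stego = embed(carrier, b"hi")
recovered = extract(stego)
```

Each carrier byte changes by at most one unit, which is why modifications of this kind are described as imperceptible.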

Keywords: Data hiding, LSB, Stego machine, Video Steganography

Downloads 4200
523 Affirming Students’ Attention and Perceptions on Prezi Presentation via Eye Tracking System

Authors: Mona Masood, Norshazlina Shaik Othman

Abstract:

The purpose of this study was to investigate graduate students’ visual attention and perceptions of a Prezi presentation. Ten master's students were shown a Prezi presentation at the Centre for Instructional Technology and Multimedia, Universiti Sains Malaysia (USM). Eye movement indicators such as dwell time, average fixation on the areas of interest, heat maps and focus maps were extracted to indicate the students’ visual attention. Descriptive statistics were employed to analyze the students’ perception of the Prezi presentation in terms of text, slide design, images, layout and overall presentation. The results revealed that the students paid most attention to the text, followed by the images and subheadings presented in the Prezi presentation.

Keywords: Eye tracking, Prezi, visual attention, visual perception.

Downloads 2279
522 A Recommender System Fusing Collaborative Filtering and User’s Review Mining

Authors: Seulbi Choi, Hyunchul Ahn

Abstract:

The collaborative filtering (CF) algorithm has been widely used for recommender systems in both academic and practical applications. It generates recommendation results using users’ numeric ratings. However, the additional use of information other than user ratings may improve the accuracy of CF. Considering that, with the advent of Web 2.0, many people are likely to share their honest opinions on the items they purchased recently, user reviews can be regarded as a new informative source for identifying user preferences accurately. Against this background, this study presents a hybrid recommender system that fuses CF and user review mining. Our system adopts conventional memory-based CF, but it is designed to use both a user’s numeric ratings and his/her text reviews of the items when calculating similarities between users.
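The abstract does not say how ratings and reviews are combined, so the sketch below shows one possible way to use both when computing user-user similarity in memory-based CF. Everything specific here is an assumption: cosine similarity over co-rated items, Jaccard overlap of review words, and a mixing weight `alpha`.

```python
import math


def rating_similarity(ratings_u, ratings_v):
    """Cosine similarity over the items both users have rated."""
    common = set(ratings_u) & set(ratings_v)
    if not common:
        return 0.0
    dot = sum(ratings_u[i] * ratings_v[i] for i in common)
    norm_u = math.sqrt(sum(ratings_u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(ratings_v[i] ** 2 for i in common))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def review_similarity(text_u, text_v):
    """Jaccard overlap of the words used in two users' reviews."""
    words_u = set(text_u.lower().split())
    words_v = set(text_v.lower().split())
    union = words_u | words_v
    return len(words_u & words_v) / len(union) if union else 0.0


def hybrid_similarity(ratings_u, ratings_v, text_u, text_v, alpha=0.5):
    """Weighted mix of rating-based and review-based similarity.

    alpha is a hypothetical tuning weight: 1.0 falls back to plain
    rating-based CF, 0.0 uses review text only.
    """
    return (alpha * rating_similarity(ratings_u, ratings_v)
            + (1 - alpha) * review_similarity(text_u, text_v))


# Toy users (assumed for illustration only).
alice = {"camera": 5, "tripod": 3}
bob = {"camera": 5, "tripod": 3}
score = hybrid_similarity(alice, bob,
                          "sharp lens great battery",
                          "great battery sharp lens")
```

The resulting similarity can replace the rating-only similarity when selecting neighbours in a conventional memory-based CF pipeline.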

Keywords: Recommender system, collaborative filtering, text mining, review mining.

Downloads 1532