Search results for: knowledgediscovery in text
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 562

202 SMEs Access to Finance in Croatia – Model Approach

Authors: Vinko Vidučić, Ljiljana Vidučić, Damir Boras

Abstract:

The goals of the research include determining the characteristics of SME finance in Croatia, as well as determining the indirect growth rates of the information model of entrepreneurs' perception of the business environment. The research results show that cost of finance and access to finance are the most important constraining factors in setting up and running a small business in Croatia. Furthermore, small entrepreneurs in Croatia are significantly dissatisfied with administrative barriers, although to a lesser extent than in the pre-crisis period. High collateral requirements are the main characteristic of bank lending to SMEs, followed by a long credit elaboration process. The formulated information model defines the individual impact of the indirect growth rates of the remaining variables on the model's specific variable.

Keywords: Business environment, information model, indirect growth rates, SME finance.

Downloads: 2153
201 Process Oriented Architecture for Emergency Scenarios in the Czech Republic

Authors: Tomáš Ludík, Josef Navrátil, Alena Langerová

Abstract:

Tackling emergency situations is performed based on emergency scenarios. These scenarios do not have a uniform form in the Czech Republic. They are unstructured and developed primarily in the text form. This does not allow solving emergency situations efficiently. For this reason, the paper aims at defining a Process Oriented Architecture to support and thus to improve tackling emergency situations in the Czech Republic. The innovative Process Oriented Architecture is based on the Workflow Reference Model while taking into account the options of Business Process Management Suites for the implementation of process oriented emergency scenarios. To verify the proposed architecture the Proof of Concept has been used which covers the reception of an emergency event at the district emergency operations centre. Within the particular implementation of the proposed architecture the Bonita Open Solution has been used. The architecture created in this way is suitable not only for emergency management, but also for educational purposes.

Keywords: Business Process Management Suite, Czech Republic, Emergency Scenarios, Process Execution, Process Oriented Architecture.

Downloads: 1826
200 Exploring Pisa Monuments Using Mobile Augmented Reality

Authors: Mihai Duguleana, Florin Girbacia, Cristian Postelnicu, Raffaello Brodi, Marcello Carrozzino

Abstract:

Augmented Reality (AR) has taken a big leap with the introduction of mobile applications which co-locate two-dimensional (e.g. photo, video, text) and three-dimensional information with the location of the user, enriching his/her experience. This study presents the advantages of using Mobile Augmented Reality (MAR) technologies in travel applications to improve cultural heritage exploration. We propose a location-based AR application which combines co-location with augmented visual information about Pisa monuments to establish friendly navigation in this historic city. AR was used to render contextual visual information in the outdoor environment. The developed Android-based application offers two options: it identifies the monuments positioned close to the user's position, and it offers location information for getting near the key tourist attractions. We present the process of creating the monuments' 3D map database and the navigation algorithm.

Keywords: Augmented reality, electronic compass, GPS, location-based service.

Downloads: 1694
199 Sounds Alike Name Matching for Myanmar Language

Authors: Yuzana, Khin Marlar Tun

Abstract:

A personal name matching system is at the core of essential tasks in national citizen databases, text and web mining, information retrieval, online library systems, e-commerce and record linkage systems. This has prompted extensive research in the area of name matching. Traditional name matching methods are suitable for English and other Latin-based languages. Asian languages that have no word boundaries, such as the Myanmar language, still require a sounds-alike matching system for Unicode-based applications. Hence we propose a matching algorithm that derives a sounds-alike (phonetic) pattern suited to Myanmar character spelling. Given the nature of Myanmar characters, we consider word boundary fragmentation and character collation. We therefore use a pattern conversion algorithm that builds word patterns from the fragmented and collated characters. We create Myanmar sounds-alike phonetic groups to help in phonetic matching. The experimental results show a fragmentation accuracy of 99.32% and a processing time of 1.72 ms.

Keywords: natural language processing, name matching, phonetic matching

Downloads: 1798
198 Adaptive Naïve Bayesian Anti-Spam Engine

Authors: Wojciech P. Gajewski

Abstract:

The problem of spam has seriously troubled the Internet community during the last few years and has now reached an alarming scale. Observations made at CERN (European Organization for Nuclear Research, located in Geneva, Switzerland) show that spam mails can constitute up to 75% of daily SMTP traffic. A naïve Bayesian classifier based on a Bag Of Words representation of an email is widely used to stop this unwanted flood, as it combines good performance with simplicity of the training and classification processes. However, given the constantly changing patterns of spam, it is necessary to ensure online adaptability of the classifier. This work proposes combining such a classifier with another NBC (naïve Bayesian classifier) based on pairs of adjacent words. Only the latter will be retrained with examples of spam reported by users. Tests are performed on considerable sets of mails from both public spam archives and CERN mailboxes. They suggest that this architecture can increase spam recall without affecting the classifier precision, as happens when only the NBC based on single words is retrained.

Keywords: Text classification, naïve Bayesian classification, spam, email.
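
The two-classifier arrangement described in this abstract can be illustrated with a minimal sketch. It is not the author's engine: the class and function names, the tiny seed corpus, the Laplace smoothing, and the rule for combining the two classifiers (adding their log-odds) are all assumptions made for illustration, since the abstract does not state them.

```python
import math
from collections import Counter


class NaiveBayes:
    """Multinomial naive Bayes over hashable features, with Laplace smoothing."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, features, label):
        self.counts[label].update(features)
        self.docs[label] += 1

    def log_odds(self, features):
        """Return log P(spam | features) - log P(ham | features)."""
        vocab = len(set(self.counts["spam"]) | set(self.counts["ham"])) + 1
        total_docs = self.docs["spam"] + self.docs["ham"]
        score = {}
        for label in ("spam", "ham"):
            prior = (self.docs[label] + 1) / (total_docs + 2)
            total = sum(self.counts[label].values())
            score[label] = math.log(prior) + sum(
                math.log((self.counts[label][f] + 1) / (total + vocab))
                for f in features
            )
        return score["spam"] - score["ham"]


def words(text):
    """Bag-of-words features: lowercase whitespace tokens."""
    return text.lower().split()


def pairs(text):
    """Features made of pairs of adjacent words."""
    w = words(text)
    return list(zip(w, w[1:]))


word_nbc = NaiveBayes()  # single-word classifier: trained once, then frozen
pair_nbc = NaiveBayes()  # word-pair classifier: retrained from user reports

# Hypothetical seed corpus, only to make the sketch runnable.
seed = [
    ("cheap pills buy now", "spam"),
    ("win money now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("beam schedule for monday", "ham"),
]
for text, label in seed:
    word_nbc.train(words(text), label)
    pair_nbc.train(pairs(text), label)


def is_spam(text):
    # Assumed combination rule: add the log-odds of both classifiers.
    return word_nbc.log_odds(words(text)) + pair_nbc.log_odds(pairs(text)) > 0


def report_spam(text):
    # Only the word-pair classifier learns from user-reported spam.
    pair_nbc.train(pairs(text), "spam")
```

Calling `report_spam(...)` changes only `pair_nbc`, so the single-word model keeps the statistics it was originally trained with.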

Downloads: 4415
197 Automatic Rearrangement of Localized Graphical User Interface

Authors: Ágoston Winkler, Sándor Juhász

Abstract:

The localization of software products is essential for reaching users in the international market. An important part of this task is the translation of the user interface into local languages. As graphical interfaces are usually optimized for the size of the texts in the original language, after translation certain user controls (e.g. text labels and buttons in dialogs) may grow in such a manner that they overlap each other. This not only causes an unpleasant appearance but also makes the program more difficult (or even impossible) to use, which implies that the arrangement of the controls must be corrected afterwards. The correction should preserve the original structure of the interface (e.g. the relation of logically coherent controls); furthermore, it is important to keep the well-proportioned design: the formation of large empty areas should be avoided. This paper describes an algorithm that automatically rearranges the controls of a graphical user interface based on the principles above. The algorithm has been implemented and integrated into a translation support system, and it produced visually pleasing results in most test cases.

Keywords: Graphical user interface, GUI, natural languages, software localization, translation support systems.
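
To make the overlap problem concrete, here is a minimal sketch for a single row of controls. It is not the algorithm from the paper, which the abstract does not detail: the function name, the `(name, x, width)` tuple layout, and the `gap` parameter are hypothetical, and the sketch only pushes controls to the right while keeping their left-to-right order.

```python
def resolve_row_overlaps(controls, gap=4):
    """Shift controls right so that none overlap after their text has grown.

    controls: list of (name, x, width) tuples for one row, in any order.
    gap: assumed minimum spacing between neighbours, in pixels.
    Returns new (name, x, width) tuples ordered left to right.
    """
    placed = []
    cursor = None  # right edge of the previously placed control
    for name, x, width in sorted(controls, key=lambda c: c[1]):
        if cursor is not None and x < cursor + gap:
            x = cursor + gap  # move just far enough to clear the neighbour
        placed.append((name, x, width))
        cursor = x + width
    return placed
```

Moving each control only as far as needed keeps empty areas small, which matches the stated design goal, but a real layout would also have to handle rows, columns, and the dialog bounds.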

Downloads: 1684
196 Road Accidents Bigdata Mining and Visualization Using Support Vector Machines

Authors: Usha Lokala, Srinivas Nowduri, Prabhakar K. Sharma

Abstract:

Useful information has been extracted from road accident data in the United Kingdom (UK), using data analytics methods, with the aim of avoiding possible accidents in rural and urban areas. This analysis makes use of several methodologies such as data integration, support vector machines (SVM), correlation machines and multinomial goodness. The entire datasets have been imported from the traffic department of the UK with due permission. The information extracted from these huge datasets forms a basis for several predictions, which in turn avoid unnecessary memory lapses. Since the data is expected to grow continuously over time, this work primarily proposes a new framework model which can be trained, adapt itself to new data and make accurate predictions. This work also sheds some light on the use of SVM methodology for text classifiers built from the obtained traffic data. Finally, it emphasizes the uniqueness and adaptability of SVM methodology for this kind of research work.

Keywords: Road accident, machine learning, support vector machines.

Downloads: 1129
195 Disclosing the Relationship among CO2 Emissions, Energy Consumption, Economic Growth and Bilateral Trade between Singapore and Malaysia: An Econometric Analysis

Authors: H. A. Bekhet, T. Yasmin

Abstract:

The aim of this paper is to examine the relationship among CO2 per capita emissions, energy consumption, economic growth and bilateral trade between Singapore and Malaysia for the 1970-2011 period. An ARDL model and Granger causality tests are employed for the analysis. Results of the bound F-statistics suggest that a long-run relationship exists between CO2 per capita (PCO2) and its determinants. The EKC hypothesis is not supported in Malaysia. Carbon emissions are mainly determined by energy consumption in the short and long run, while exports to Singapore are a significant variable in explaining PCO2 emissions in Malaysia in the long run. Furthermore, we find a unidirectional causal relationship running from economic growth to PCO2 emissions.

Keywords: ARDL Bound Test, Bilateral trade, CO2 emission, Environmental Kuznets Curve, Energy consumption, Malaysia.

Downloads: 2651
194 A Tree Based Association Rule Approach for XML Data with Semantic Integration

Authors: D. Sasikala, K. Premalatha

Abstract:

The use of eXtensible Markup Language (XML) in web, business and scientific databases has led to the development of methods, techniques and systems to manage and analyze XML data. Semi-structured documents suffer from heterogeneity and high dimensionality. XML structure and content mining represent a convergence of research in semi-structured data and text mining. As the information available on the internet grows drastically, extracting knowledge from XML documents becomes a harder task. Indeed, documents are often so large that the data set returned as the answer to a query may itself be too big to convey the required information. To improve query answering, a Semantic Tree Based Association Rule (STAR) mining method is proposed. This method provides intentional information by considering the structure, the content and the semantics of the content. The method is applied to the Reuters dataset, and the results show that the proposed method performs well.

Keywords: Semi-structured Document, Tree based Association Rule (TAR), Semantic Association Rule Mining.

Downloads: 2352
193 Verification and Application of Finite Element Model Developed for Flood Routing in Rivers

Authors: A. L. Qureshi, A. A. Mahessar, A. Baloch

Abstract:

Flood wave propagation in river channel flow can be described by nonlinear equations of motion for unsteady flow. It is difficult to find an analytical solution to these nonlinear equations. Hence, in this paper the finite element model has been verified against available numerical predictions and field data. The results of the model agree well with both the Preissmann scheme and the HEC-RAS model for a river reach of 29 km at both sites (15 km from upstream and at the downstream end) for discharge hydrographs. The model also compares well with the Preissmann scheme for the flow depth (stage) hydrographs. The proposed model has also been applied to forecast daily discharges 400 km downstream of Sukkur barrage on the Indus River in Sindh, Pakistan, where the model predictions agree closely with the observed daily discharges. Hence, this model may be utilized for advance flood warnings.

Keywords: Finite Element Method, Flood Forecasting, HEC-RAS, Indus river.

Downloads: 2685
192 Determinants of Knowledge-Based Improving Workflow and Communication within Surgical Team

Authors: J. Bartnicka

Abstract:

A surgical team consists of various types of medical specialists possessing different kinds of knowledge, motivations, personalities and abilities. This, together with poor knowledge transfer and a lack of information and communication technology (ICT) implementations in hospitals, can prolong patient care processes and even jeopardize patient safety. The article presents the outcomes of studies on communication and workflow in surgical teams against the background of different collaboration levels in the healthcare system. As a result, five determinants of improving workflow and communication within a surgical team were identified, and knowledge-based tools and supporting information technology were proposed.

Keywords: Knowledge transfer, absorption abilities, knowledge representation, information and communication technologies, cooperation.

Downloads: 2134
191 A Cheating Model for Cellular Automata-Based Secret Sharing Schemes

Authors: Borna Jafarpour, Azadeh Nematzadeh, Vahid Kazempour, Babak Sadeghian

Abstract:

Cellular automata have been used for the design of cryptosystems. Recently, some secret sharing schemes based on linear memory cellular automata have been introduced which are used for both text and image. In this paper, we show that these secret sharing schemes are vulnerable to collusion among dishonest participants. We propose a cheating model for secret sharing schemes based on linear memory cellular automata. For this purpose we present a novel uniform model for the representation of all secret sharing schemes based on cellular automata. Participants can cheat by sending bogus shares or bogus transition rules. Cheaters can cooperate to corrupt a shared secret and compute a cheating value added to it. Honest participants are not aware of the cheating and take the incorrect secret to be the valid one. We prove that cheaters can recover the valid secret by removing the cheating value from the corrupted secret. We provide methods of calculating the cheating value.

Keywords: Cellular automata, cheating model, secret sharing, threshold scheme.

Downloads: 1586
190 The Attitudes of Pre-Service Teachers towards Analytical Thinking Skill Development Based On Miller’s Model

Authors: Thassanant Unnanantn, Suttipong Boonphadung

Abstract:

This research study aimed to survey and analyze the attitudes of pre-service teachers towards analytical thinking development based on Miller’s Model. The informants of this study were 22 third-year teacher students majoring in Thai. The course in which the instruction was conducted was English for Academic Purposes in Thai Language 2. The research instrument was an open-ended questionnaire with two dimensions of questions: academic and satisfaction. The investigation revealed positive attitudes. In the academic dimension, the majority, 12 informants (54.54%), the highest percentage, reported that the method of teaching analytical thinking and language simultaneously was new knowledge to them, and a similar percentage said the same of text cohesion in writing. In the satisfaction dimension, the highest frequency count was 17 informants (77.27%), and this majority favored the openness or friendliness of the teacher.

Keywords: Analytical thinking development, Attitudes, Miller’s Model, Pre-service teachers.

Downloads: 2114
189 A Study of Growth Factors on Sustainable Manufacturing in Small and Medium-Sized Enterprises: Case Study of Japan Manufacturing

Authors: Tadayuki Kyoutani, Shigeyuki Haruyama, Ken Kaminishi, Zefry Darmawan

Abstract:

Japan’s semiconductor industries have developed greatly in recent years. Many started as Small and Medium-sized Enterprises (SMEs) that found favorable circumstances and have now become prosperous industries worldwide. Sustainable growth factors that support the creation of spirit value inside Japanese companies were strongly embedded through performance. Those factors were not clearly defined among the companies. A series of literature studies was conducted to explore, through quantitative text mining, the definition of sustainable growth factors. Sustainability criteria were developed from previous research to verify the definition of the factors. A typical framework was proposed as a systematic approach to developing sustainable growth factors in a specific company. A review of the approach over a certain period shows that the factors influencing sustainable growth were important for the company to achieve its goal.

Keywords: SME, manufacture, sustainable, growth factor.

Downloads: 635
188 Energy-Level Structure of a Confined Electron-Positron Pair in Nanostructure

Authors: Tokuei Sako, Paul-Antoine Hervieux

Abstract:

The energy-level structure of an electron-positron pair confined in a quasi-one-dimensional nano-scale potential well has been investigated, focusing on its trend in the limit of small confinement strength ω, namely, the Wigner molecular regime. Anisotropic Gaussian-type basis functions supplemented by high angular momentum functions as large as l = 19 have been used to obtain reliable full configuration interaction (FCI) wave functions. The resultant energy spectrum shows a band structure characterized by ω for the large ω regime, whereas for the small ω regime it shows an energy-level pattern dominated by excitation into the in-phase motion of the two particles. The observed trend has been rationalized on the basis of the nodal patterns of the FCI wave functions.

Keywords: Confined systems, positron, wave function, Wigner molecule, quantum dots.

Downloads: 1853
187 Player Number Localization and Recognition in Soccer Video using HSV Color Space and Internal Contours

Authors: Matko Šaric, Hrvoje Dujmic, Vladan Papic, Nikola Rožic

Abstract:

Detection of player identity is a challenging task in sports video content analysis. In the case of soccer video, player number recognition is an effective and precise solution. Jersey numbers can be considered scene text, and difficulties in localization and recognition appear due to variations in orientation, size, illumination, motion, etc. This paper proposes a new method for player number localization and recognition. By observing hue, saturation and value for 50 different jersey examples, we noticed that a combination of low- and high-saturation pixels is most often used to separate the number and jersey regions. An image segmentation method based on this observation is introduced. Then, a novel method for player number localization based on internal contours is proposed. False number candidates are filtered using area and aspect ratio. Before OCR processing, the extracted numbers are enhanced using image smoothing and rotation normalization.

Keywords: player number, soccer video, HSV color space
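
Two of the steps in this abstract, separating pixels by saturation and discarding implausible number candidates by area and aspect ratio, can be sketched as follows. This is only an illustration under stated assumptions: the saturation threshold, the area and aspect-ratio bounds, and the function names are hypothetical and not taken from the paper, and contour extraction and OCR are left out.

```python
import colorsys

SAT_THRESHOLD = 0.5                 # hypothetical low/high saturation split
MIN_AREA, MAX_AREA = 40, 2000       # hypothetical bounding-box area bounds (pixels)
MIN_ASPECT, MAX_ASPECT = 0.2, 1.0   # hypothetical width/height bounds


def saturation_mask(image):
    """Mark high-saturation pixels.

    image: rows of (r, g, b) tuples with components in 0..255.
    Returns rows of booleans, True where HSV saturation exceeds the threshold.
    """
    return [
        [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[1] > SAT_THRESHOLD
         for (r, g, b) in row]
        for row in image
    ]


def plausible_number(box):
    """Keep a candidate only if its area and aspect ratio look like a digit.

    box: (x, y, width, height) bounding box of an internal contour.
    """
    _, _, w, h = box
    if w <= 0 or h <= 0:
        return False
    return MIN_AREA <= w * h <= MAX_AREA and MIN_ASPECT <= w / h <= MAX_ASPECT
```

For example, `plausible_number((0, 0, 10, 20))` accepts a tall, narrow box, while a wide strip such as `(0, 0, 100, 5)` is rejected on aspect ratio.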

Downloads: 1987
186 Flexible Development and Calculation of Contract Logistics Services

Authors: T. Spiegel, J. Siegmann, C. F. Durach

Abstract:

Challenges resulting from an international and dynamic business environment are increasingly being passed on from manufacturing companies to external service providers. Especially providers of complex, customer-specific industry services have to cope with continuously changing requirements. This is particularly true for contract logistics service providers. They are forced to develop efficient and highly flexible structures and strategies to meet their customer’s needs. One core element they have to focus on is the reorganization of their service development and sales process. Based on an action research approach, this study develops and tests a concept to streamline tender management for contract logistics service providers. The concept of modularized service architecture is deployed in order to derive a practice-oriented approach for the modularization of complex service portfolios and the design of customized quotes. These findings are evaluated regarding their applicability in other service sectors and practical recommendations are given.

Keywords: Contract Logistics, Modularization, Service Development, Tender Management.

Downloads: 2071
185 Specification of Attributes of a Multimedia Presentation for Presentation Manager

Authors: Veli Hakkoymaz, Alpaslan Altunköprü

Abstract:

A multimedia presentation system refers to the integration of a multimedia database with a presentation manager which has the functionality of content selection, organization and playout of multimedia presentations. It requires high performance of involved system components. Starting from multimedia information capture until the presentation delivery, high performance tools are required for accessing, manipulating, storing and retrieving these segments, for transferring and delivering them in a presentation terminal according to a playout order. The organization of presentations is a complex task in that the display order of presentation contents (in time and space) must be specified. A multimedia presentation contains audio, video, images and text media types. The critical decisions for presentation construction include what the contents are, how the contents are organized, and once the decision is made on the organization of the contents of the presentation, it must be conveyed to the end user in the correct organizational order and in a timely fashion. This paper introduces a framework for specification of multimedia presentations and describes the design of sample presentations using this framework from a multimedia database.

Keywords: Multimedia presentation, temporal specification, SMIL, spatial specification.

Downloads: 1814
184 Multiplayer RC-Car Driving System in a Collaborative Augmented Reality Environment

Authors: Kikuo Asai, Yuji Sugimoto

Abstract:

We developed a prototype system for multiplayer RC-car driving in a collaborative augmented reality (AR) environment. The tele-existence environment is constructed by superimposing digital data onto images captured by a camera on an RC-car, enabling players to experience an augmented coexistence of the digital content and the real world. Marker-based tracking was used for estimating the position and orientation of the camera. Multiple RC-cars can be operated in a field where square markers are arranged. The video images captured by the camera are transmitted to a PC for visual tracking. The RC-cars are also tracked using an infrared camera attached to the ceiling, which reduces instability in the visual tracking. Multimedia data such as text and graphics are overlaid onto the video images in a geometrically correct manner. The prototype system allows a tele-existence sensation to be augmented in a collaborative AR environment.

Keywords: Multiplayer, RC-car, Collaborative Environment, Augmented Reality.

Downloads: 2068
183 Providing a Secure Hybrid Method for Graphical Password Authentication to Prevent Shoulder Surfing, Smudge and Brute Force Attack

Authors: Faraji Sepideh

Abstract:

Nowadays, the purchase rate of smart devices is increasing, and user authentication is one of the important issues in information security. Strong alphanumeric passwords are difficult to memorize, so owners write them down on paper or save them in a computer file. In addition, text passwords have their own flaws and are vulnerable to attacks. A graphical password, in which users choose images as a password, can be used as an alternative to an alphanumeric password. This type of password is easier to use and memorize and is also more secure than previous password types. In this paper we have designed a more secure graphical password system to prevent shoulder surfing, smudge and brute force attacks. This scheme is a combination of two types of graphical passwords: recognition-based and cued-recall-based. An evaluation of the usability and security of our proposed scheme is given in the conclusion.

Keywords: Brute force attack, graphical password, shoulder surfing attack, smudge attack.

Downloads: 913
182 Slug Initiation Evaluation in Long Horizontal Channels Experimentally

Authors: P. Adibi, M. R. Ansari, S. Jafari, B. Habibpour, E. Salimi

Abstract:

In this paper, the effect of gas and liquid superficial inlet velocities and, for the first time, the effect of liquid holdup on slug initiation position are studied experimentally. Empirical correlations are also presented based on the obtained results. The tests are conducted for three liquid holdups in a long horizontal channel with a 5 cm × 10 cm cross-section and a length of 36 m. Usl and Usg range from 0.11 m/s to 0.56 m/s and from 1.88 m/s to 13 m/s, respectively. The obtained results show that for αl = 0.25, the slug initiation position increases monotonically with Usl and Usg. For αl = 0.50, the slug initiation position is almost constant. For αl = 0.75, the slug initiation position decreases monotonically with Usl and Usg. In the case of equal void fractions of the phases, the generated slugs are weak (low pressure). However, for unequal void fractions of the phases, strong slugs (high pressure) are formed.

Keywords: Liquid holdup, Long horizontal channel, Slug initiation position, Superficial inlet velocity.

Downloads: 1868
181 GPS Devices to Increase Efficiency of Indian Auto-Rickshaw Segment

Authors: Sanchay Vaidya, Sourabh Gupta, Gouresh Singhal

Abstract:

There are various modes of transport in metro cities in India, auto-rickshaws being one of them. Auto-rickshaws provide connectivity to all places in the city, offering last-mile connectivity. Among all the modes of transport, the auto-rickshaw industry is the most unorganized and inefficient. Although unions exist in different cities, they are not equipped to cope with upcoming advancements in technology. Introducing simple technology in this field may do wonders and help increase revenues. This paper aims to organize this segment under a single umbrella using GPS devices and mobile phones. The paper includes surveys of about 300 auto-rickshaw drivers and more than 1000 commuters across 6 metro cities in India. This research and analysis provides a base for the development of the model and the implementation of this innovative technique, which is discussed in detail in this paper with ample emphasis on the implementation of the model.

Keywords: Auto-rickshaws, Business Model, GPS device, Mobile application.

Downloads: 3385
180 Retaining Users in a Commercially-Supported Social Network

Authors: Sasiphan Nitayaprapha

Abstract:

A commercially-supported social network has become an emerging channel for an organization to communicate with and provide services to customers. The success of a commercially-supported social network depends on the ability of the organization to keep customers participating in the network. Drawing from the theories of information adoption, information systems continuance, and web usability, the author develops a model to explore how a commercially-supported social network can encourage customers to continue participating and using the information in the network. The theoretical model will be tested through an online survey of customers using the commercially-supported social networking sites of several high technology companies operating in the same sector. The result will be compared with previous studies to learn about the explanatory power of the research model, and to identify the main factors determining users’ intention to continue using a commercially-supported social network. Theoretical and practical implications and limitations are discussed.

Keywords: Social network, Information adoption, Information systems continuance, Web usability, User satisfaction.

Downloads: 1895
179 Data Gathering and Analysis for Arabic Historical Documents

Authors: Ali Dulla

Abstract:

This paper introduces a new dataset (and the methodology used to generate it) based on a wide range of historical Arabic documents containing clean data and simple, homogeneous page layouts. The experiments are implemented on printed and handwritten documents obtained from important libraries such as the Qatar Digital Library, the British Library and the Library of Congress. We have gathered and commented on 150 archival document images from different locations and time periods, drawn from documents of the 17th to 19th centuries. The dataset comprises differing page layouts and degradations that challenge text line segmentation methods. Ground truth is produced using the Aletheia tool by PRImA and stored in an XML representation, in the PAGE (Page Analysis and Ground truth Elements) format. The dataset will be easily available to researchers worldwide for research into the obstacles facing historical Arabic documents, such as geometric correction.

Keywords: Dataset production, ground truth production, historical documents, arbitrary warping, geometric correction.

Downloads: 866
178 PIELG: A Protein Interaction Extraction System using a Link Grammar Parser from Biomedical Abstracts

Authors: Rania A. Abul Seoud, Nahed H. Solouma, Abou-Baker M. Youssef, Yasser M. Kadah

Abstract:

Due to the ever-growing number of publications about protein-protein interactions, information extraction from text is increasingly recognized as one of the crucial technologies in bioinformatics. This paper presents a Protein Interaction Extraction System using a Link Grammar Parser from biomedical abstracts (PIELG). PIELG uses the linkage given by the Link Grammar Parser to start a case-based analysis of the contents of various syntactic roles as well as their linguistically significant and meaningful combinations. The system uses phrasal-prepositional verb patterns to overcome preposition combination problems. The recall and precision are 74.4% and 62.65%, respectively. Experimental comparisons with two other state-of-the-art extraction systems indicate that the PIELG system achieves better performance. For further evaluation, the system is augmented with a graphical package (Cytoscape) for extracting protein interaction information from sequence databases. The result shows that the performance is remarkably promising.

Keywords: Link Grammar Parser, Interaction extraction, protein-protein interaction, Natural language processing.

Downloads 2254
177 Unsupervised Text Mining Approach to Early Warning System

Authors: Ichihan Tai, Bill Olson, Paul Blessner

Abstract:

Traditional early warning systems that warn of crises are generally based on structured or numerical data; therefore, a system that can make predictions based on unstructured textual data, an uncorrelated data source, is a great complement to the traditional early warning systems. The Chicago Board Options Exchange (CBOE) Volatility Index (VIX), commonly referred to as the fear index, measures the cost of insurance against a market crash, and spikes in the event of a crisis. In this study, news data is used to predict whether there will be a market-wide crisis by predicting the movement of the fear index, and historical references to similar events are presented in an unsupervised manner. Topic modeling-based prediction and representation are made based on daily news data between 1990 and 2015 from The Wall Street Journal against VIX index data from CBOE.
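The two ingredients described above, a movement label for the index and a lookup of similar past news days, can be sketched in a few lines. This is not the authors' method: it substitutes plain bag-of-words cosine similarity for topic modeling, and the function names, dates, headlines, and index values are all made-up toy assumptions.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lower-cased word counts; a stand-in for a topic representation."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def vix_direction(vix):
    """Label day i 'up' if the index closes higher on day i+1, else 'down'."""
    return ["up" if vix[i + 1] > vix[i] else "down"
            for i in range(len(vix) - 1)]


def most_similar_day(history, query):
    """history: list of (date, text). Returns the entry closest to query."""
    q = bag_of_words(query)
    return max(history, key=lambda item: cosine(bag_of_words(item[1]), q))


# Toy data, invented for illustration only.
history = [
    ("day-1", "banks report strong earnings and markets rally"),
    ("day-2", "credit crisis deepens as lenders fail"),
    ("day-3", "central bank holds rates steady"),
]
print(vix_direction([12.0, 15.5, 14.0]))
print(most_similar_day(history, "lenders fail amid new credit crisis")[0])
```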

Keywords: Early Warning System, Knowledge Management, Topic Modeling, Market Prediction.

Downloads 1920
176 Architecture of Speech-based Registration System

Authors: Mayank Kumar, D B Mahesh Kumar, Ashwin S Kumar, N K Srinath

Abstract:

In this era of technology, fueled by the pervasive usage of the internet, security is a prime concern. The number of new attacks by the so-called “bots”, which are automated programs, is increasing at an alarming rate. They are most likely to attack online registration systems. Technologies called “CAPTCHA” (Completely Automated Public Turing test to tell Computers and Humans Apart) exist, which can differentiate between automated programs and humans and prevent replay attacks. Traditionally, CAPTCHAs have been implemented with a challenge that involves recognizing textual images and reproducing the text. We propose an approach where the visual challenge has to be read out loud, and randomly selected keywords from it are used to verify the correctness of the spoken text and, in turn, detect the presence of a human. This is supplemented with a speaker recognition system which can also identify the speaker. Thus, this framework fulfills both objectives: it can determine whether the user is a human and, if so, verify their identity.
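The keyword check at the heart of this idea can be sketched as follows. The sketch assumes that a speech recognizer has already produced a text transcript, which is outside its scope; the helper names `tokenize`, `pick_keywords`, and `verify_spoken` and the challenge sentence are hypothetical, and the speaker recognition step is not modeled.

```python
import random
import re


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def pick_keywords(challenge, k, rng):
    """Randomly choose up to k distinct words from the challenge text."""
    words = sorted(set(tokenize(challenge)))
    return rng.sample(words, min(k, len(words)))


def verify_spoken(keywords, transcript):
    """Accept only if every selected keyword occurs in the transcript."""
    spoken = set(tokenize(transcript))
    return bool(keywords) and all(kw in spoken for kw in keywords)


challenge = "seven green boats crossed the quiet river"
keywords = pick_keywords(challenge, 3, random.Random(0))

# A transcript of the full challenge passes; an empty one does not.
print(verify_spoken(keywords, "Seven green boats crossed the quiet river"))
print(verify_spoken(keywords, ""))
```

Because the keywords are drawn at random per attempt, a recording of a previous session would only pass if it happened to contain the newly selected words.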

Keywords: CAPTCHA, automatic speech recognition, keyword spotting.

Downloads 1547
175 Development of a Serial Signal Monitoring Program for Educational Purposes

Authors: Jungho Moon, Lae-Jeong Park

Abstract:

This paper introduces a signal monitoring program developed to help electrical engineering students become familiar with sensors that have digital output. Because the output of digital sensors cannot be simply monitored by a measuring instrument such as an oscilloscope, students tend to have a hard time dealing with digital sensors. The monitoring program runs on a PC and communicates with an MCU that reads the output of digital sensors via an asynchronous communication interface. Receiving the sensor data from the MCU, the monitoring program shows time and/or frequency domain plots of the data in real time. In addition, the monitoring program provides a serial terminal that enables the user to exchange text information with the MCU while the received data is plotted. The user can easily observe the output of digital sensors and configure them in real time, which helps students who do not have enough experience with digital sensors. Though the monitoring program was written in MATLAB, it runs without MATLAB since it was compiled as a standalone executable.
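To make the frequency-domain view concrete, the sketch below computes a one-sided magnitude spectrum of a block of samples with a direct discrete Fourier transform and reports the strongest non-DC component. It is written in Python rather than MATLAB, uses synthetic samples in place of data received over a serial link, and the function names and the 100 Hz sample rate are assumptions for illustration only.

```python
import cmath
import math


def dft_magnitudes(samples):
    """One-sided DFT magnitude spectrum, bins 0 .. n//2, scaled by 1/n."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        mags.append(abs(acc) / n)
    return mags


def dominant_frequency(samples, sample_rate_hz):
    """Frequency in Hz of the largest spectral bin, ignoring the DC bin."""
    mags = dft_magnitudes(samples)
    k = max(range(1, len(mags)), key=lambda i: mags[i])
    return k * sample_rate_hz / len(samples)


# Synthetic sensor data: a 5 Hz sine with a constant offset, sampled at 100 Hz.
SAMPLE_RATE_HZ = 100
samples = [2.0 + math.sin(2 * math.pi * 5 * i / SAMPLE_RATE_HZ)
           for i in range(100)]
print(dominant_frequency(samples, SAMPLE_RATE_HZ))
```

A direct DFT is quadratic in the block length; a real-time monitor would more likely use an FFT routine, but the output has the same meaning.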

Keywords: Digital sensor, MATLAB, MCU, signal monitoring program.

Downloads 2115
174 Unit Selection Algorithm Using Bi-grams Model For Corpus-Based Speech Synthesis

Authors: Mohamed Ali KAMMOUN, Ahmed Ben HAMIDA

Abstract:

In this paper, we present a novel statistical approach to corpus-based speech synthesis. Classically, phonetic information is defined and considered as an acoustic reference to be respected. Accordingly, many studies have addressed acoustical unit classification. This type of classification allows separating units according to their symbolic characteristics. Indeed, target cost and concatenation cost were classically defined for unit selection. In corpus-based speech synthesis systems using large text corpora, cost functions were limited to a juxtaposition of symbolic criteria, and the acoustic information of units was not exploited in the definition of the target cost. In this manuscript, we take into consideration the phonetic information of units corresponding to their acoustic information. This is realized by defining a probabilistic linguistic bi-gram model used for unit selection. The selected units are extracted from the English TIMIT corpus.
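The abstract does not spell out the model, so the following is only a generic sketch of how a bigram model can rank candidate unit sequences: bigram probabilities are estimated from labeled sequences with add-one smoothing, and the candidate with the highest log-probability is selected. The phone labels, training sequences, and function names are toy assumptions, not TIMIT data or the authors' cost function.

```python
import math
from collections import Counter


def train_bigrams(sequences):
    """Count left-context occurrences and bigrams over label sequences."""
    context, bigrams, vocab = Counter(), Counter(), set()
    for seq in sequences:
        vocab.update(seq)
        for a, b in zip(seq, seq[1:]):
            context[a] += 1
            bigrams[(a, b)] += 1
    return context, bigrams, len(vocab)


def bigram_prob(model, a, b):
    """P(b | a) with add-one (Laplace) smoothing."""
    context, bigrams, vocab_size = model
    return (bigrams[(a, b)] + 1) / (context[a] + vocab_size)


def sequence_score(model, seq):
    """Log-probability of a sequence under the bigram model."""
    return sum(math.log(bigram_prob(model, a, b))
               for a, b in zip(seq, seq[1:]))


def select_units(model, candidates):
    """Pick the candidate sequence the model finds most probable."""
    return max(candidates, key=lambda seq: sequence_score(model, seq))


# Toy phone-label sequences, invented for illustration.
model = train_bigrams([
    ["s", "p", "iy", "ch"],
    ["s", "p", "eh", "l"],
    ["s", "p", "iy", "k"],
])
print(select_units(model, [["s", "p", "iy"], ["s", "iy", "p"]]))
```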

Keywords: Unit selection, Corpus-based Speech Synthesis, Bigram model.

Downloads 1441
173 Image Spam Detection Using Color Features and K-Nearest Neighbor Classification

Authors: T. Kumaresan, S. Sanjushree, C. Palanisamy

Abstract:

Image spam is a kind of email spam where the spam text is embedded in an image. It is a new technique used by spammers to send their messages to large numbers of internet users. Spam email has become a big problem in the lives of internet users, causing lost time and economic losses. The main objective of this paper is to detect image spam by using the histogram properties of an image. Though there are many techniques to automatically detect and avoid this problem, spammers employ new tricks to bypass them; as a result, those techniques are inefficient at detecting spam mails. In this paper we propose a new method to detect image spam. Image features are extracted using the RGB histogram, the HSV histogram, and a combination of both. Based on the optimized image feature set, classification is done using the k-Nearest Neighbor (k-NN) algorithm. Experimental results show that our method achieves better accuracy. The results indicate that the combination of RGB and HSV histograms with the k-NN algorithm gives the best accuracy in spam detection.
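A minimal sketch of the pipeline described here, combined RGB and HSV histograms fed to a k-NN vote, is shown below. It works on lists of `(r, g, b)` pixel tuples so that it needs only the standard library; the bin count, the Euclidean distance, the value of k, and the toy "images" are assumptions chosen for illustration, not the paper's settings or data.

```python
import colorsys
from collections import Counter


def color_histogram(pixels, bins=4):
    """Concatenated, normalized RGB and HSV histograms (6 * bins values).

    pixels is a list of (r, g, b) tuples with components in 0..255.
    """
    hist = [0.0] * (6 * bins)
    for r, g, b in pixels:
        rn, gn, bn = r / 255, g / 255, b / 255
        h, s, v = colorsys.rgb_to_hsv(rn, gn, bn)
        for channel, value in enumerate((rn, gn, bn, h, s, v)):
            idx = min(int(value * bins), bins - 1)  # clamp value == 1.0
            hist[channel * bins + idx] += 1
    n = len(pixels) or 1
    return [count / n for count in hist]


def knn_predict(train, query, k=3):
    """Majority label among the k training vectors closest to query."""
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Toy single-color "images", invented for illustration.
train = [
    (color_histogram([(255, 0, 0)] * 10), "spam"),
    (color_histogram([(250, 10, 10)] * 10), "spam"),
    (color_histogram([(120, 110, 100)] * 10), "ham"),
    (color_histogram([(100, 120, 110)] * 10), "ham"),
]
query = color_histogram([(240, 5, 5)] * 10)
print(knn_predict(train, query, k=3))
```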

Keywords: File Type, HSV Histogram, k-NN, RGB Histogram, Spam Detection.

Downloads 2142