Search results for: Parallel text alignment
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1239

Search results for: Parallel text alignment

969 Face Localization and Recognition in Varied Expressions and Illumination

Authors: Hui-Yu Huang, Shih-Hang Hsu

Abstract:

In this paper, we propose a robust scheme to work face alignment and recognition under various influences. For face representation, illumination influence and variable expressions are the important factors, especially the accuracy of facial localization and face recognition. In order to solve those of factors, we propose a robust approach to overcome these problems. This approach consists of two phases. One phase is preprocessed for face images by means of the proposed illumination normalization method. The location of facial features can fit more efficient and fast based on the proposed image blending. On the other hand, based on template matching, we further improve the active shape models (called as IASM) to locate the face shape more precise which can gain the recognized rate in the next phase. The other phase is to process feature extraction by using principal component analysis and face recognition by using support vector machine classifiers. The results show that this proposed method can obtain good facial localization and face recognition with varied illumination and local distortion.

Keywords: Gabor filter, improved active shape model (IASM), principal component analysis (PCA), face alignment, face recognition, support vector machine (SVM)

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1451
968 Solving 94-bit ECDLP with 70 Computers in Parallel

Authors: Shunsuke Miyoshi, Yasuyuki Nogami, Takuya Kusaka, Nariyoshi Yamai

Abstract:

Elliptic curve discrete logarithm problem(ECDLP) is one of problems on which the security of pairing-based cryptography is based. This paper considers Pollard’s rho method to evaluate the security of ECDLP on Barreto-Naehrig(BN) curve that is an efficient pairing-friendly curve. Some techniques are proposed to make the rho method efficient. Especially, the group structure on BN curve, distinguished point method, and Montgomery trick are well-known techniques. This paper applies these techniques and shows its optimization. According to the experimental results for which a large-scale parallel system with MySQL is applied, 94-bit ECDLP was solved about 28 hours by parallelizing 71 computers.

Keywords: Pollard’s rho method, BN curve, Montgomery multiplication.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1818
967 A Character Detection Method for Ancient Yi Books Based on Connected Components and Regressive Character Segmentation

Authors: Xu Han, Shanxiong Chen, Shiyu Zhu, Xiaoyu Lin, Fujia Zhao, Dingwang Wang

Abstract:

Character detection is an important issue for character recognition of ancient Yi books. The accuracy of detection directly affects the recognition effect of ancient Yi books. Considering the complex layout, the lack of standard typesetting and the mixed arrangement between images and texts, we propose a character detection method for ancient Yi books based on connected components and regressive character segmentation. First, the scanned images of ancient Yi books are preprocessed with nonlocal mean filtering, and then a modified local adaptive threshold binarization algorithm is used to obtain the binary images to segment the foreground and background for the images. Second, the non-text areas are removed by the method based on connected components. Finally, the single character in the ancient Yi books is segmented by our method. The experimental results show that the method can effectively separate the text areas and non-text areas for ancient Yi books and achieve higher accuracy and recall rate in the experiment of character detection, and effectively solve the problem of character detection and segmentation in character recognition of ancient books.

Keywords: Computing methodologies, interest point, salient region detections, image segmentation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 793
966 Industrial Waste Monitoring

Authors: Khairuddin Bin Osman, Ngo Boon Kiat, A. Hamid Bin hamidon, Khairul Azha Bin A. Aziz, Hazli Rafis Bin Abdul Rahman, Mazran Bin Esro

Abstract:

Conventional industrial monitoring systems are tedious, inefficient and the at times integrity of the data is unreliable. The objective of this system is to monitor industrial processes specifically the fluid level which will measure the instantaneous fluid level parameter and respond by text messaging the exact value of the parameter to the user when being enquired by a privileged access user. The development of the embedded program code and the circuit for fluid level measuring are discussed as well. Suggestions for future implementations and efficient remote monitoring works are included.

Keywords: Industrial monitoring system, text messaging, embedded programming.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1633
965 Using Artificial Neural Network to Predict Collisions on Horizontal Tangents of 3D Two-Lane Highways

Authors: Omer F. Cansiz, Said M. Easa

Abstract:

The purpose of this study is mainly to predict collision frequency on the horizontal tangents combined with vertical curves using artificial neural network methods. The proposed ANN models are compared with existing regression models. First, the variables that affect collision frequency were investigated. It was found that only the annual average daily traffic, section length, access density, the rate of vertical curvature, smaller curve radius before and after the tangent were statistically significant according to related combinations. Second, three statistical models (negative binomial, zero inflated Poisson and zero inflated negative binomial) were developed using the significant variables for three alignment combinations. Third, ANN models are developed by applying the same variables for each combination. The results clearly show that the ANN models have the lowest mean square error value than those of the statistical models. Similarly, the AIC values of the ANN models are smaller to those of the regression models for all the combinations. Consequently, the ANN models have better statistical performances than statistical models for estimating collision frequency. The ANN models presented in this paper are recommended for evaluating the safety impacts 3D alignment elements on horizontal tangents.

Keywords: Collision frequency, horizontal tangent, 3D two-lane highway, negative binomial, zero inflated Poisson, artificial neural network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1600
964 An Ultra-Low Output Impedance Power Amplifier for Tx Array in 7-Tesla Magnetic Resonance Imaging

Authors: Ashraf Abuelhaija, Klaus Solbach

Abstract:

In Ultra high-field MRI scanners (3T and higher), parallel RF transmission techniques using multiple RF chains with multiple transmit elements are a promising approach to overcome the high-field MRI challenges in terms of inhomogeneity in the RF magnetic field and SAR. However, mutual coupling between the transmit array elements disturbs the desirable independent control of the RF waveforms for each element. This contribution demonstrates a 18 dB improvement of decoupling (isolation) performance due to the very low output impedance of our 1 kW power amplifier.

Keywords: EM coupling, Inter-element isolation, Magnetic resonance imaging (MRI), Parallel Transmit.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1687
963 AGHAZ : An Expert System Based approach for the Translation of English to Urdu

Authors: Uzair Muhammad, Kashif Bilal, Atif Khan, M. Nasir Khan

Abstract:

Machine Translation (MT 3) of English text to its Urdu equivalent is a difficult challenge. Lot of attempts has been made, but a few limited solutions are provided till now. We present a direct approach, using an expert system to translate English text into its equivalent Urdu, using The Unicode Standard, Version 4.0 (ISBN 0-321-18578-1) Range: 0600–06FF. The expert system works with a knowledge base that contains grammatical patterns of English and Urdu, as well as a tense and gender-aware dictionary of Urdu words (with their English equivalents).

Keywords: Machine Translation, Multiword Expressions, Urdulanguage processing, POS12 Tagging for Urdu, Expert Systems.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2309
962 Enhanced Parallel-Connected Comb Filter Method for Multiple Pitch Estimation

Authors: Taro Matsuno, Yuta Otani, Ryo Tanaka, Kaori Ikezaki, Hitoshi Yamamoto, Masaru Fujieda, Yoshihisa Ishida

Abstract:

This paper presents an improvement method of the multiple pitch estimation algorithm using comb filters. Conventionally the pitch was estimated by using parallel -connected comb filters method (PCF). However, PCF has problems which often fail in the pitch estimation when there is the fundamental frequency of higher tone near harmonics of lower tone. Therefore the estimation is assigned to a wrong note when shared frequencies happen. This issue often occurs in estimating octave 3 or more. Proposed method, for solving the problem, estimates the pitch with every harmonic instead of every octave. As a result, our method reaches the accuracy of more than 80%.

Keywords: music transcription, pitch estimation, comb filter, fractional delay

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1372
961 Performance Analysis of Load Balancing Algorithms

Authors: Sandeep Sharma, Sarabjit Singh, Meenakshi Sharma

Abstract:

Load balancing is the process of improving the performance of a parallel and distributed system through a redistribution of load among the processors [1] [5]. In this paper we present the performance analysis of various load balancing algorithms based on different parameters, considering two typical load balancing approaches static and dynamic. The analysis indicates that static and dynamic both types of algorithm can have advancements as well as weaknesses over each other. Deciding type of algorithm to be implemented will be based on type of parallel applications to solve. The main purpose of this paper is to help in design of new algorithms in future by studying the behavior of various existing algorithms.

Keywords: Load balancing (LB), workload, distributed systems, Static Load balancing, Dynamic Load Balancing

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5867
960 A New Load Frequency Controller based on Parallel Fuzzy PI with Conventional PD (FPI-PD)

Authors: Aqeel S. Jaber, Abu Zaharin Ahmad, Ahmed N. Abdalla

Abstract:

The artificial intelligent controller in power system plays as most important rule for many applications such as system operation and its control specially Load Frequency Controller (LFC). The main objective of LFC is to keep the frequency and tie-line power close to their decidable bounds in case of disturbance. In this paper, parallel fuzzy PI adaptive with conventional PD technique for Load Frequency Control system was proposed. PSO optimization method used to optimize both of scale fuzzy PI and tuning of PD. Two equal interconnected power system areas were used as a test system. Simulation results show the effectiveness of the proposed controller compared with different PID and classical fuzzy PI controllers in terms of speed response and damping frequency.

Keywords: Load frequency control, PSO, fuzzy control.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1986
959 A Novel Methodology Proposed for Optimizing the Degree of Hybridization in Parallel HEVs using Genetic Algorithm

Authors: K. Varesi, A. Radan

Abstract:

In this paper, a new Genetic Algorithm (GA) based methodology is proposed to optimize the Degree of Hybridization (DOH) in a passenger parallel hybrid car. At first step, target parameters for the vehicle are decided and then using ADvanced VehIcle SimulatOR (ADVISOR) software, the variation pattern of these target parameters, across the different DOHs, is extracted. At the next step, a suitable cost function is defined and is optimized using GA. In this paper, also a new technique has been proposed for deciding the number of battery modules for each DOH, which leads to a great improvement in the vehicle performance. The proposed methodology is so simple, fast and at the same time, so efficient.

Keywords: Degree of Hybridization (DOH), Electric Motor, Emissions, Fuel Economy, Genetic Algorithm (GA), Hybrid ElectricVehicle (HEV), Vehicle Performance

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1801
958 National Image in the Age of Mass Self-Communication: An Analysis of Internet Users' Perception of Portugal

Authors: L. Godinho, N. Teixeira

Abstract:

Nowadays, massification of Internet access represents one of the major challenges to the traditional powers of the State, among which the power to control its external image. The virtual world has also sparked the interest of social sciences which consider it a new field of study, an immense open text where sense is expressed. In this paper, that immense text has been accessed to so as to understand the perception Internet users from all over the world have of Portugal. Ours is a quantitative and qualitative approach, as we have resorted to buzz, thematic and category analysis. The results confirm the predominance of sea stereotype in others' vision of the Portuguese people, and evidence that national image has adapted to network communication through processes of individuation and paganization.

Keywords: Internet, national image, perception, web analytics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1020
957 The Challenges of Hyper-Textual Learning Approach for Religious Education

Authors: Elham Shirvani–Ghadikolaei, Seyed Mahdi Sajjadi

Abstract:

State of the art technology has the tremendous impact on our life, in this situation education system have been influenced as well as. In this paper, tried to compare two space of learning text and hypertext with each other, and some challenges of using hypertext in religious education. Regarding the fact that, hypertext is an undeniable part of learning in this world and it has highly beneficial for the education process from class to office and home. In this paper tried to solve this question: the consequences and challenges of applying hypertext in religious education. Also, the consequences of this survey demonstrate the role of curriculum designer and planner of education to solve this problem.

Keywords: Hyper-textual, education, religious text, religious education.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1356
956 Improved Pattern Matching Applied to Surface Mounting Devices Components Localization on Automated Optical Inspection

Authors: Pedro M. A. Vitoriano, Tito. G. Amaral

Abstract:

Automated Optical Inspection (AOI) Systems are commonly used on Printed Circuit Boards (PCB) manufacturing. The use of this technology has been proven as highly efficient for process improvements and quality achievements. The correct extraction of the component for posterior analysis is a critical step of the AOI process. Nowadays, the Pattern Matching Algorithm is commonly used, although this algorithm requires extensive calculations and is time consuming. This paper will present an improved algorithm for the component localization process, with the capability of implementation in a parallel execution system.

Keywords: AOI, automated optical inspection, SMD, surface mounting devices, pattern matching, parallel execution.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1023
955 Improvement of Parallel Compressor Model in Dealing Outlet Unequal Pressure Distribution

Authors: Kewei Xu, Jens Friedrich, Kevin Dwinger, Wei Fan, Xijin Zhang

Abstract:

Parallel Compressor Model (PCM) is a simplified approach to predict compressor performance with inlet distortions. In PCM calculation, it is assumed that the sub-compressors’ outlet static pressure is uniform and therefore simplifies PCM calculation procedure. However, if the compressor’s outlet duct is not long and straight, such assumption frequently induces error ranging from 10% to 15%. This paper provides a revised calculation method of PCM that can correct the error. The revised method employs energy equation, momentum equation and continuity equation to acquire needed parameters and replace the equal static pressure assumption. Based on the revised method, PCM is applied on two compression system with different blades types. The predictions of their performance in non-uniform inlet conditions are yielded through the revised calculation method and are employed to evaluate the method’s efficiency. Validating the results by experimental data, it is found that although little deviation occurs, calculated result agrees well with experiment data whose error ranges from 0.1% to 3%. Therefore, this proves the revised calculation method of PCM possesses great advantages in predicting the performance of the distorted compressor with limited exhaust duct.

Keywords: Parallel Compressor Model (PCM), Revised Calculation Method, Inlet Distortion, Outlet Unequal Pressure Distribution.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1635
954 Reliability Analysis of k-out-of-n : G System Using Triangular Intuitionistic Fuzzy Numbers

Authors: Tanuj Kumar, Rakesh Kumar Bajaj

Abstract:

In the present paper, we analyze the vague reliability of k-out-of-n : G system (particularly, series and parallel system) with independent and non-identically distributed components, where the reliability of the components are unknown. The reliability of each component has been estimated using statistical confidence interval approach. Then we converted these statistical confidence interval into triangular intuitionistic fuzzy numbers. Based on these triangular intuitionistic fuzzy numbers, the reliability of the k-out-of-n : G system has been calculated. Further, in order to implement the proposed methodology and to analyze the results of k-out-of-n : G system, a numerical example has been provided.

Keywords: Vague set, vague reliability, triangular intuitionistic fuzzy number, k-out-of-n : G system, series and parallel system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2934
953 Analysis of Slip Flow Heat Transfer between Asymmetrically Heated Parallel Plates

Authors: Hari Mohan Kushwaha, Santosh K. Sahu

Abstract:

In the present study, analysis of heat transfer is carried out in the slip flow region for the fluid flowing between two parallel plates by employing the asymmetric heat fluxes at surface of the plates. The flow is assumed to be hydrodynamically and thermally fully developed for the analysis. The second order velocity slip and viscous dissipation effects are considered for the analysis. Closed form expressions are obtained for the Nusselt number as a function of Knudsen number and modified Brinkman number. The limiting condition of the present prediction for Kn = 0, Kn2 = 0, and Brq1 = 0 is considered and found to agree well with other analytical results.

Keywords: Knudsen Number, Modified Brinkman Number, Slip Flow, Velocity Slip.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1398
952 A Study of the Variability of Very Low Resolution Characters and the Feasibility of Their Discrimination Using Geometrical Features

Authors: Farshideh Einsele, Rolf Ingold

Abstract:

Current OCR technology does not allow to accurately recognizing small text images, such as those found in web images. Our goal is to investigate new approaches to recognize very low resolution text images containing antialiased character shapes. This paper presents a preliminary study on the variability of such characters and the feasibility to discriminate them by using geometrical features. In a first stage we analyze the distribution of these features. In a second stage we present a study on the discriminative power for recognizing isolated characters, using various rendering methods and font properties. Finally we present interesting results of our evaluation tests leading to our conclusion and future focus.

Keywords: World Wide Web, document analysis, pattern recognition, Optical Character Recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1334
951 An efficient Activity Network Reduction Algorithm based on the Label Correcting Tracing Algorithm

Authors: Weng Ming Chu

Abstract:

When faced with stochastic networks with an uncertain duration for their activities, the securing of network completion time becomes problematical, not only because of the non-identical pdf of duration for each node, but also because of the interdependence of network paths. As evidenced by Adlakha & Kulkarni [1], many methods and algorithms have been put forward in attempt to resolve this issue, but most have encountered this same large-size network problem. Therefore, in this research, we focus on network reduction through a Series/Parallel combined mechanism. Our suggested algorithm, named the Activity Network Reduction Algorithm (ANRA), can efficiently transfer a large-size network into an S/P Irreducible Network (SPIN). SPIN can enhance stochastic network analysis, as well as serve as the judgment of symmetry for the Graph Theory.

Keywords: Series/Parallel network, Stochastic network, Network reduction, Interdictive Graph, Complexity Index.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1336
950 Towards Self-ware via Swarm-Array Computing

Authors: Blesson Varghese, Gerard McKee

Abstract:

The work reported in this paper proposes Swarm-Array computing, a novel technique inspired by swarm robotics, and built on the foundations of autonomic and parallel computing. The approach aims to apply autonomic computing constructs to parallel computing systems and in effect achieve the self-ware objectives that describe self-managing systems. The constitution of swarm-array computing comprising four constituents, namely the computing system, the problem/task, the swarm and the landscape is considered. Approaches that bind these constituents together are proposed. Space applications employing FPGAs are identified as a potential area for applying swarm-array computing for building reliable systems. The feasibility of a proposed approach is validated on the SeSAm multi-agent simulator and landscapes are generated using the MATLAB toolkit.

Keywords: Swarm-Array computing, Autonomic computing, landscapes.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1546
949 Capture Zone of a Well Field in an Aquifer Bounded by Two Parallel Streams

Authors: S. Nagheli, N. Samani, D. A. Barry

Abstract:

In this paper, the velocity potential and stream function of capture zone for a well field in an aquifer bounded by two parallel streams with or without a uniform regional flow of any directions are presented. The well field includes any number of extraction or injection wells or a combination of both types with any pumping rates. To delineate the capture envelope, the potential and streamlines equations are derived by conformal mapping method. This method can help us to release constrains of other methods. The equations can be applied as useful tools to design in-situ groundwater remediation systems, to evaluate the surface–subsurface water interaction and to manage the water resources.

Keywords: Complex potential, conformal mapping, groundwater remediation, image well theory, Laplace’s equation, superposition principle.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 820
948 Component-based Segmentation of Words from Handwritten Arabic Text

Authors: Jawad H AlKhateeb, Jianmin Jiang, Jinchang Ren, Stan S Ipson

Abstract:

Efficient preprocessing is very essential for automatic recognition of handwritten documents. In this paper, techniques on segmenting words in handwritten Arabic text are presented. Firstly, connected components (ccs) are extracted, and distances among different components are analyzed. The statistical distribution of this distance is then obtained to determine an optimal threshold for words segmentation. Meanwhile, an improved projection based method is also employed for baseline detection. The proposed method has been successfully tested on IFN/ENIT database consisting of 26459 Arabic words handwritten by 411 different writers, and the results were promising and very encouraging in more accurate detection of the baseline and segmentation of words for further recognition.

Keywords: Arabic OCR, off-line recognition, Baseline estimation, Word segmentation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2166
947 Emotions in Health Tweets: Analysis of American Government Official Accounts

Authors: García López

Abstract:

The Government Departments of Health have the task of informing and educating citizens about public health issues. For this, they use channels like Twitter, key in the search for health information and the propagation of content. The tweets, important in the virality of the content, may contain emotions that influence the contagion and exchange of knowledge. The goal of this study is to perform an analysis of the emotional projection of health information shared on Twitter by official American accounts: the disease control account CDCgov, National Institutes of Health, NIH, the government agency HHSGov, and the professional organization PublicHealth. For this, we used Tone Analyzer, an International Business Machines Corporation (IBM) tool specialized in emotion detection in text, corresponding to the categorical model of emotion representation. For 15 days, all tweets from these accounts were analyzed with the emotional analysis tool in text. The results showed that their tweets contain an important emotional load, a determining factor in the success of their communications. This exposes that official accounts also use subjective language and contain emotions. The predominance of emotion joy over sadness and the strong presence of emotions in their tweets stimulate the virality of content, a key in the work of informing that government health departments have.

Keywords: Emotions in tweets emotion detection in text, health information on Twitter, American health official accounts, emotions on Twitter, emotions and content.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 660
946 Techniques with Statistics for Web Page Watermarking

Authors: Mohamed Lahcen BenSaad, Sun XingMing

Abstract:

Information hiding, especially watermarking is a promising technique for the protection of intellectual property rights. This technology is mainly advanced for multimedia but the same has not been done for text. Web pages, like other documents, need a protection against piracy. In this paper, some techniques are proposed to show how to hide information in web pages using some features of the markup language used to describe these pages. Most of the techniques proposed here use the white space to hide information or some varieties of the language in representing elements. Experiments on a very small page and analysis of five thousands web pages show that these techniques have a wide bandwidth available for information hiding, and they might form a solid base to develop a robust algorithm for web page watermarking.

Keywords: Digital Watermarking, Information Hiding, Markup Language, Text watermarking, Software Watermarking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1750
945 Calibration of Parallel Multi-View Cameras

Authors: M. Ali-Bey, N. Manamanni, S. Moughamir

Abstract:

This paper focuses on the calibration problem of a multi-view shooting system designed for the production of 3D content for auto-stereoscopic visualization. The considered multiview camera is characterized by coplanar and decentered image sensors regarding to the corresponding optical axis. Based on the Faugéras and Toscani-s calibration approach, a calibration method is herein proposed for the case of multi-view camera with parallel and decentered image sensors. At first, the geometrical model of the shooting system is recalled and some industrial prototypes with some shooting simulations are presented. Next, the development of the proposed calibration method is detailed. Finally, some simulation results are presented before ending with some conclusions about this work.

Keywords: Auto-stereoscopic display, camera calibration, multi-view cameras, visual servoing

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1650
944 Compression of Semistructured Documents

Authors: Leo Galambos, Jan Lansky, Katsiaryna Chernik

Abstract:

EGOTHOR is a search engine that indexes the Web and allows us to search the Web documents. Its hit list contains URL and title of the hits, and also some snippet which tries to shortly show a match. The snippet can be almost always assembled by an algorithm that has a full knowledge of the original document (mostly HTML page). It implies that the search engine is required to store the full text of the documents as a part of the index. Such a requirement leads us to pick up an appropriate compression algorithm which would reduce the space demand. One of the solutions could be to use common compression methods, for instance gzip or bzip2, but it might be preferable if we develop a new method which would take advantage of the document structure, or rather, the textual character of the documents. There already exist a special compression text algorithms and methods for a compression of XML documents. The aim of this paper is an integration of the two approaches to achieve an optimal level of the compression ratio

Keywords: Compression, search engine, HTML, XML.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1532
943 A Combined Cipher Text Policy Attribute-Based Encryption and Timed-Release Encryption Method for Securing Medical Data in Cloud

Authors: G. Shruthi, Purohit Shrinivasacharya

Abstract:

The biggest problem in cloud is securing an outsourcing data. A cloud environment cannot be considered to be trusted. It becomes more challenging when outsourced data sources are managed by multiple outsourcers with different access rights. Several methods have been proposed to protect data confidentiality against the cloud service provider to support fine-grained data access control. We propose a method with combined Cipher Text Policy Attribute-based Encryption (CP-ABE) and Timed-release encryption (TRE) secure method to control medical data storage in public cloud.

Keywords: Attribute, encryption, security, trapdoor.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 691
942 Event Information Extraction System (EIEE): FSM vs HMM

Authors: Shaukat Wasi, Zubair A. Shaikh, Sajid Qasmi, Hussain Sachwani, Rehman Lalani, Aamir Chagani

Abstract:

Automatic Extraction of Event information from social text stream (emails, social network sites, blogs etc) is a vital requirement for many applications like Event Planning and Management systems and security applications. The key information components needed from Event related text are Event title, location, participants, date and time. Emails have very unique distinctions over other social text streams from the perspective of layout and format and conversation style and are the most commonly used communication channel for broadcasting and planning events. Therefore we have chosen emails as our dataset. In our work, we have employed two statistical NLP methods, named as Finite State Machines (FSM) and Hidden Markov Model (HMM) for the extraction of event related contextual information. An application has been developed providing a comparison among the two methods over the event extraction task. It comprises of two modules, one for each method, and works for both bulk as well as direct user input. The results are evaluated using Precision, Recall and F-Score. Experiments show that both methods produce high performance and accuracy, however HMM was good enough over Title extraction and FSM proved to be better for Venue, Date, and time.

Keywords: Emails, Event Extraction, Event Detection, Finite state machines, Hidden Markov Model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2273
941 Designing Ontology-Based Knowledge Integration for Preprocessing of Medical Data in Enhancing a Machine Learning System for Coding Assignment of a Multi-Label Medical Text

Authors: Phanu Waraporn

Abstract:

This paper discusses the designing of knowledge integration of clinical information extracted from distributed medical ontologies in order to ameliorate a machine learning-based multilabel coding assignment system. The proposed approach is implemented using a decision tree technique of the machine learning on the university hospital data for patients with Coronary Heart Disease (CHD). The preliminary results obtained show a satisfactory finding that the use of medical ontologies improves the overall system performance.

Keywords: Medical Ontology, Knowledge Integration, Machine Learning, Medical Coding, Text Assignment.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1805
940 Sorting Primitives and Genome Rearrangementin Bioinformatics: A Unified Perspective

Authors: Swapnoneel Roy, Minhazur Rahman, Ashok Kumar Thakur

Abstract:

Bioinformatics and computational biology involve the use of techniques including applied mathematics, informatics, statistics, computer science, artificial intelligence, chemistry, and biochemistry to solve biological problems usually on the molecular level. Research in computational biology often overlaps with systems biology. Major research efforts in the field include sequence alignment, gene finding, genome assembly, protein structure alignment, protein structure prediction, prediction of gene expression and proteinprotein interactions, and the modeling of evolution. Various global rearrangements of permutations, such as reversals and transpositions,have recently become of interest because of their applications in computational molecular biology. A reversal is an operation that reverses the order of a substring of a permutation. A transposition is an operation that swaps two adjacent substrings of a permutation. The problem of determining the smallest number of reversals required to transform a given permutation into the identity permutation is called sorting by reversals. Similar problems can be defined for transpositions and other global rearrangements. In this work we perform a study about some genome rearrangement primitives. We show how a genome is modelled by a permutation, introduce some of the existing primitives and the lower and upper bounds on them. We then provide a comparison of the introduced primitives.

Keywords: Sorting Primitives, Genome Rearrangements, Transpositions, Block Interchanges, Strip Exchanges.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2102