Search results for: text search queries
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1341

861 Phase Control Array Synthesis Using Constrained Accelerated Particle Swarm Optimization

Authors: Mohammad Taha, Dia abu al Nadi

Abstract:

In this paper, the phase control antenna array synthesis is presented. The problem is formulated as a constrained optimization problem that imposes nulls with prescribed level while maintaining the sidelobe at a prescribed level. For efficient use of the algorithm memory, compared to the well known Particle Swarm Optimization (PSO), the Accelerated Particle Swarm Optimization (APSO) is used to estimate the phase parameters of the synthesized array. The objective function is formed using a main objective and set of constraints with penalty factors that measure the violation of each feasible solution in the search space to each constraint. In this case the obtained feasible solution is guaranteed to satisfy all the constraints. Simulation results have shown significant performance increases and a decreased randomness in the parameter search space compared to a single objective conventional particle swarm optimization.
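The penalty formulation described in this abstract can be illustrated with a minimal sketch. It assumes the commonly cited APSO update, x ← (1 − β)x + βg* + αε, and a toy two-variable problem rather than the paper's array factor. The function names, penalty weight, and parameter values are illustrative assumptions, not the authors' settings.

```python
import random

def penalized(f, constraints, x, weight=1e3):
    """Main objective plus a weighted penalty for each violated constraint g(x) <= 0."""
    violation = sum(max(0.0, g(x)) ** 2 for g in constraints)
    return f(x) + weight * violation

def apso(f, constraints, bounds, n_particles=20, iters=200, alpha=0.2, beta=0.5, seed=0):
    """Accelerated PSO: no velocities or personal bests, only the global best is kept."""
    rng = random.Random(seed)
    swarm = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    best = min(swarm, key=lambda x: penalized(f, constraints, x))[:]
    for t in range(iters):
        step = alpha * (0.97 ** t)  # shrink the random step as the search proceeds
        for x in swarm:
            for d, (lo, hi) in enumerate(bounds):
                # Pull toward the global best and add a scaled Gaussian perturbation.
                x[d] = (1 - beta) * x[d] + beta * best[d] + step * rng.gauss(0, 1) * (hi - lo)
                x[d] = min(hi, max(lo, x[d]))  # keep the particle inside the bounds
            if penalized(f, constraints, x) < penalized(f, constraints, best):
                best = x[:]
    return best

# Toy problem (not the antenna problem): minimize x^2 + y^2 subject to x + y >= 1.
objective = lambda x: x[0] ** 2 + x[1] ** 2
constraints = [lambda x: 1.0 - (x[0] + x[1])]  # rewritten in the form g(x) <= 0
solution = apso(objective, constraints, bounds=[(-2.0, 2.0), (-2.0, 2.0)])
```

Because only the global best is stored, the memory footprint is smaller than in standard PSO, which is the efficiency argument the abstract makes.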

Keywords: Array synthesis, Sidelobe level control, Constrained optimization, Accelerated Particle Swarm Optimization.

860 Mining and Visual Management of XML-Based Image Collections

Authors: Khalil Shihab, Nida Al-Chalabi

Abstract:

This article describes Uruk, the virtual museum of Iraq that we developed for visual exploration and retrieval of image collections. The system largely exploits the loosely structured hierarchy of XML documents, which provides a useful way to store semi-structured or unstructured data that does not easily fit into existing databases. The system lets users mine and manage the XML-based image collections through a web-based Graphical User Interface (GUI). Typically, in an interactive session with the system, the user browses a visual structural summary of the XML database in order to select interesting elements. Using this intermediate result, queries combining structure and textual references can be composed and presented to the system. After query evaluation, the full set of answers is presented in a visual and structured way.

Keywords: Data-centric XML, graphical user interfaces, information retrieval, case-based reasoning, fuzzy sets

859 A Scalable Media Job Framework for an Open Source Search Engine

Authors: Pooja Mishra, Chris Pollett

Abstract:

This paper explores efficient ways to implement various media-updating features like news aggregation, video conversion, and bulk email handling. All of these jobs share the property that they are periodic in nature, and they all benefit from being handled in a distributed fashion. The data for these jobs also often comes from a social or collaborative source. We isolate the class of periodic, one round map reduce jobs as a useful setting to describe and handle media updating tasks. As such tasks are simpler than general map reduce jobs, programming them in a general map reduce platform could easily become tedious. This paper presents a MediaUpdater module of the Yioop Open Source Search Engine Web Portal designed to handle such jobs via an extension of a PHP class. We describe how to implement various media-updating tasks in our system as well as experiments carried out using these implementations on an Amazon Web Services cluster.
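A one-round map reduce job of the kind this abstract describes can be sketched as follows. This is an illustrative Python sketch, not the Yioop MediaUpdater API (which the abstract says is a PHP class); the function names and the sample feed items are hypothetical. A scheduler would call `run_one_round` at a fixed interval to make the job periodic.

```python
from collections import defaultdict

def run_one_round(items, map_fn, reduce_fn):
    """One map pass, one grouping by key, one reduce pass; no chained rounds."""
    groups = defaultdict(list)
    for item in items:
        for key, value in map_fn(item):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}

# Hypothetical news-aggregation job: keep the newest headline per source.
feed_items = [
    {"source": "a.example", "title": "Old story", "published": 1},
    {"source": "a.example", "title": "New story", "published": 5},
    {"source": "b.example", "title": "Only story", "published": 3},
]

def map_item(item):
    # Emit (source, (timestamp, title)) so the reducer can pick the latest entry.
    yield item["source"], (item["published"], item["title"])

def newest(source, values):
    # Tuples compare by timestamp first, so max() returns the latest entry.
    return max(values)[1]

latest = run_one_round(feed_items, map_item, newest)
```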

Keywords: Distributed jobs framework, news aggregation, video conversion, email.

858 Using Suffix Tree Document Representation in Hierarchical Agglomerative Clustering

Authors: Daniel I. Morariu, Radu G. Cretulescu, Lucian N. Vintan

Abstract:

In text categorization, the most widely used document representation is the Vector Space Model (VSM), which is based on word-frequency vectors. This representation uses only the words in a document and therefore loses any “word context” information. In this article we compare the classical representation with the Suffix Tree Document Model (STDM), which represents documents in suffix tree format. For the STDM we propose a new approach to document representation and a new formula for computing the similarity between two documents. Specifically, we propose to build the suffix tree for only two documents at a time. This approach is faster, consumes less memory, and uses the entire document representation without requiring methods for disposing of nodes. The proposed similarity formula substantially improves clustering quality. The representation was validated using Hierarchical Agglomerative Clustering (HAC). In this context we also examine the influence of stemming in the document preprocessing step and highlight the difference between similarity and dissimilarity measures for finding “closer” documents.
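The loss of word context in the VSM can be seen in a small sketch. The shared-phrase function below only stands in for the kind of information a suffix tree exposes; it is not the similarity formula proposed in the paper, which the abstract does not give. All names are illustrative.

```python
from collections import Counter
import math

def term_vector(text):
    """Bag-of-words frequency vector, as in the Vector Space Model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def shared_phrases(x, y, min_len=2):
    """Word sequences of at least min_len words that occur in both texts."""
    def phrases(words):
        return {tuple(words[i:j])
                for i in range(len(words))
                for j in range(i + min_len, len(words) + 1)}
    return phrases(x.lower().split()) & phrases(y.lower().split())

d1 = "dog bites man"
d2 = "man bites dog"
vsm_similarity = cosine(term_vector(d1), term_vector(d2))  # identical bags of words
common = shared_phrases(d1, d2)                            # no multi-word phrase in common
```

The two sentences are indistinguishable under the VSM, while a phrase-aware representation finds nothing in common beyond single words.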

Keywords: Text Clustering, Suffix tree document representation, Hierarchical Agglomerative Clustering

857 A Biometric Template Security Approach to Fingerprints Based on Polynomial Transformations

Authors: Ramon Santana

Abstract:

The use of biometric identifiers in information security, access control, and authentication at ATMs and in banking, among other fields, raises serious concern about the safety of biometric data. Eight vulnerabilities have been identified in the general architecture of a biometric system, six of which allow the minutiae template to be obtained in plain text. The main consequence of obtaining minutiae templates is the loss of the biometric identifier for life. To mitigate these vulnerabilities, several models for protecting minutiae templates have been proposed. However, weaknesses in the cryptographic security of these models still allow biometric data to be obtained in plain text. To increase cryptographic security and ease of reversibility, a minutiae template protection model is proposed. The model aims to provide cryptographic protection and to facilitate the reversibility of data using two levels of security. The first is the data transformation level, which generates data that is invariant to rotation and translation; furthermore, the transformation is irreversible. The second is the evaluation level, where the encryption key is generated and the data is evaluated using a defined evaluation function. The model is aimed at mitigating the known vulnerabilities of previously proposed models, basing its security on the impossibility of polynomial reconstruction.

Keywords: Fingerprint, template protection, bio-cryptography, minutiae protection.

856 Decision Tree-based Feature Ranking using Manhattan Hierarchical Cluster Criterion

Authors: Yasmin Mohd Yacob, Harsa A. Mat Sakim, Nor Ashidi Mat Isa

Abstract:

Feature selection is gaining importance because it reduces classification cost in terms of time and computational load. One way to search for essential features is via a decision tree, which acts as an intermediate feature space inducer. In decision tree-based feature selection, some studies use the decision tree as a feature ranker with a direct threshold measure, while others retain the decision tree but use a pruning condition that acts as a threshold mechanism for choosing features. This paper proposes a threshold measure based on the Manhattan Hierarchical Cluster distance for use in feature ranking, in order to choose relevant features as part of the feature selection process. The results are promising, and the method can be improved in the future by including test cases with a larger number of attributes.

Keywords: Feature ranking, decision tree, hierarchical cluster, Manhattan distance.

855 Promoting Gender Equality within Islamic Tradition via Contextualist Approach

Authors: Ali Akbar

Abstract:

The importance of advancing women’s rights is closely intertwined with the development of civil society and the institutionalization of democracy in Middle Eastern countries. There is indeed an intimate relationship between the process of democratization and promoting gender equality, since democracy necessitates equality between men and women. In order to advance the issue of gender equality, what is required is a solid theoretical framework which has its roots in the reexamination of pre-modern interpretations of certain Qurʾānic passages that seem to have given men more rights than women. This paper suggests that those Muslim scholars who adopt a contextualist approach to the Qurʾānic text and its interpretation provide a solid theoretical background for improving women’s rights. Indeed, the aim of the paper is to discuss how the contextualist approach to the Qurʾānic text and its interpretation given by a number of prominent scholars is capable of promoting the issue of gender equality. The paper concludes that since (1) much of the gender inequality found in the primary sources of Islam as well as pre-modern Muslim writings is rooted in the natural cultural norms and standards of early Islamic societies and (2) the context of today’s world is so different from that of the pre-modern era, the proposed models provide a solid theoretical framework for promoting women’s rights and gender equality.

Keywords: Contextualism, Gender equality, Islam, Women’s rights.

854 A Robust Optimization Model for the Single-Depot Capacitated Location-Routing Problem

Authors: Abdolsalam Ghaderi

Abstract:

In this paper, the single-depot capacitated location-routing problem under uncertainty is presented. The problem aims to find the optimal location of a single depot and the routing of vehicles to serve the customers when the parameters may change under different circumstances. This problem has many applications, especially in the area of supply chain management and distribution systems. To get closer to real-world situations, the travel time of vehicles, the fixed cost of vehicle usage, and customers’ demand are treated as sources of uncertainty. A combined approach including robust optimization and stochastic programming is presented to deal with the uncertainty in the problem at hand. For this purpose, a mixed integer programming model is developed and a heuristic algorithm based on Variable Neighborhood Search (VNS) is presented to solve the model. Finally, the computational results are presented and future research directions are discussed.

Keywords: Location-routing problem, robust optimization, Stochastic Programming, variable neighborhood search.

853 The Comparison of Anchor and Star Schema from a Query Performance Perspective

Authors: Radek Němec

Abstract:

Today's business environment requires that companies have access to highly relevant information in a matter of seconds. Modern Business Intelligence tools rely on data structured mostly in traditional dimensional database schemas, typically represented by star schemas. Dimensional modeling is already recognized as a leading industry standard in the field of data warehousing, although several drawbacks and pitfalls have been reported. This paper analyzes another data warehouse modeling technique, anchor modeling, and compares its characteristics with the standardized dimensional modeling technique from a query performance perspective. The results of the analysis show the performance of queries executed on database schemas structured according to the principles of each modeling technique.

Keywords: Data warehousing, anchor modeling, star schema, anchor schema, query performance.

852 A New Classification of Risk-Reduction Options to Improve the Risk-Reduction Readiness of the Railway Industry

Authors: Eberechi Weli, Michael Todinov

Abstract:

The gap between the selection of risk-reduction options in the railway industry and the task of their effective implementation results in compromised safety and substantial losses. An effective risk management must necessarily integrate the evaluation phases with the implementation phase. This paper proposes an essential categorisation of risk reduction measures that best addresses a standard railway industry portfolio. By categorising the risk reduction options into design, operational, procedural and technical options, it is guaranteed that the efforts of the implementation facilitators (people, processes and supporting systems) are systematically harmonised. The classification is based on an integration of fundamental principles of risk reduction in the railway industry with the systems engineering approach.

This paper argues that the use of a similar classification approach is an attribute of organisations possessing a superior level of risk-reduction readiness. The integration of the proposed rational classification structure provides a solid ground for effective risk reduction.

Keywords: Cost effectiveness, organisational readiness, risk reduction, railway, system engineering.

851 Solving Bus Terminal Location Problem Using Genetic Algorithm

Authors: S. Babaie-Kafaki, R. Ghanbari, S.H. Nasseri, E. Ardil

Abstract:

Bus network design is an important problem in public transportation. The main step in this design is determining the number of required terminals and their locations. This is a special type of facility location problem, a large-scale combinatorial optimization problem that takes a long time to solve. The genetic algorithm (GA) is a search and optimization technique based on the evolutionary principles of natural chromosomes. Specifically, the evolution of chromosomes through crossover, mutation, and natural selection based on Darwin's survival-of-the-fittest principle is artificially simulated to constitute a robust search and optimization procedure. In this paper, we first state the problem as a mixed integer programming (MIP) problem. We then design new crossover and mutation operators for the bus terminal location problem (BTLP). We tested different genetic algorithm parameters on a sample problem and obtained the best parameters for solving the BTLP by numerical trial and error.

Keywords: Bus networks, Genetic algorithm (GA), Location problem, Mixed integer programming (MIP).

850 Neural-Symbolic Machine-Learning for Knowledge Discovery and Adaptive Information Retrieval

Authors: Hager Kammoun, Jean Charles Lamirel, Mohamed Ben Ahmed

Abstract:

In this paper, a model for an information retrieval system is proposed which takes into account that knowledge about documents and the information needs of users are dynamic. Two methods are combined, one qualitative or symbolic and the other quantitative or numeric, which are deemed suitable for many clustering contexts, data analysis, concept exploration and knowledge discovery. These two methods may be classified as inductive learning techniques. In this model, they are introduced to build “long term” knowledge about past queries and concepts in a collection of documents. The “long term” knowledge can guide and assist the user in formulating an initial query and can be exploited in the process of retrieving relevant information. The different kinds of knowledge are organized in different points of view. This may be considered an enrichment of the exploration level which is coherent with the concept of document/query structure.

Keywords: Information Retrieval Systems, machine learning, classification, Galois lattices, Self Organizing Map.

849 Estimation of Synchronous Machine Synchronizing and Damping Torque Coefficients

Authors: Khaled M. EL-Naggar

Abstract:

Synchronizing and damping torque coefficients of a synchronous machine can give a clear picture of machine behavior during transients. These coefficients are used as a measure of power system transient stability. In this paper, a crow search optimization algorithm is presented and implemented to study power system stability during transients. The algorithm makes use of the machine responses to perform the stability study in the time domain. The problem is formulated as a dynamic estimation problem. An objective function that minimizes the squared error in the estimated coefficients is designed. The method is tested using a practical system with different study cases. Results are reported and a thorough discussion is presented. The study illustrates that the proposed method can estimate the stability coefficients for critically stable cases where other methods may fail. The tests showed that the proposed tool is accurate and reliable for estimating the machine coefficients for assessment of power system stability.

Keywords: Optimization, estimation, synchronous, machine, crow search.

848 A Methodology for Investigating Public Opinion Using Multilevel Text Analysis

Authors: William Xiu Shun Wong, Myungsu Lim, Yoonjin Hyun, Chen Liu, Seongi Choi, Dasom Kim, Kee-Young Kwahk, Namgyu Kim

Abstract:

Recently, many users have begun to frequently share their opinions on diverse issues using various social media. Therefore, numerous governments have attempted to establish or improve national policies according to the public opinions captured from various social media. In this paper, we point out several limitations of the traditional approaches to analyzing public opinion on science and technology and provide an alternative methodology to overcome these limitations. First, we distinguish between the science and technology analysis phase and the social issue analysis phase to reflect the fact that public opinion can be formed only when a certain science and technology is applied to a specific social issue. Next, we successively apply a start list and a stop list to acquire clarified and interesting results. Finally, to identify the most appropriate documents that fit with a given subject, we develop a new logical filter concept that consists of not only mere keywords but also a logical relationship among the keywords. This study then analyzes the possibilities for the practical use of the proposed methodology through its application to discover core issues and public opinions from 1,700,886 documents comprising SNS, blogs, news, and discussions.
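The start list, stop list, and logical filter steps can be sketched roughly as follows. The function names, the AND/OR/NOT structure, and the sample keywords are assumptions made for illustration; the abstract does not specify the form of the paper's filter.

```python
def select_terms(tokens, start_list, stop_list):
    """Apply the start list first (keep only listed terms), then the stop list (drop listed terms)."""
    kept = [t for t in tokens if t in start_list]
    return [t for t in kept if t not in stop_list]

def make_filter(all_of=(), any_of=(), none_of=()):
    """Build a document predicate that combines keywords with AND, OR and NOT relationships."""
    def matches(text):
        words = set(text.lower().split())
        return (all(k in words for k in all_of)
                and (not any_of or any(k in words for k in any_of))
                and not any(k in words for k in none_of))
    return matches

# Hypothetical subject: drones discussed in a privacy or safety context, excluding toy drones.
drone_policy = make_filter(all_of=("drone",), any_of=("privacy", "safety"), none_of=("toy",))

documents = [
    "Drone privacy rules debated",
    "Toy drone safety recall",
    "Drone delivery expands",
]
selected = [d for d in documents if drone_policy(d)]
```

A plain keyword match on "drone" would return all three documents; the logical relationships narrow the result to the one that fits the subject.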

Keywords: Big data, social network analysis, text mining, topic modeling.

847 A Simple Low-Cost 2-D Optical Measurement System for Linear Guideways

Authors: Wen-Yuh Jywe, Bor-Jeng Lin, Jing-Chung Shen, Jeng-Dao Lee, Hsueh-Liang Huang, Tung-Hsien Hsieh

Abstract:

In this study, a simple 2-D measurement system based on optical design was developed to measure the motion errors of the linear guideway. Compared with traditional methods for measuring the motion errors of a linear guideway, the proposed 2-D optical measurement system can measure horizontal and vertical running straightness errors simultaneously.

The performance of the 2-D optical measurement system is verified by experimental results. The standard deviation of the 2-D optical measurement system is about 0.4 μm in the measurement range of 100 mm. The maximum measuring speed of the proposed automatic measurement instrument is 1 m/s.

Keywords: 2-D measurement, linear guideway, motion errors, running straightness.

846 A Distributed Approach to Extract High Utility Itemsets from XML Data

Authors: S. Kannimuthu, K. Premalatha

Abstract:

This paper investigates a new data mining capability: mining High Utility Itemsets (HUIs) in a distributed environment. Existing research in data mining considers only the presence or absence of items and does not consider semantic measures such as the weight or cost of the items. HUI mining algorithms evolved to address this. HUI mining is a kind of utility mining that aims to identify itemsets whose utility satisfies a given threshold. However, mining HUIs in a distributed environment, and mining them from XML data, have not yet been explored. In this work, a novel approach is proposed to mine HUIs from XML-based data in a distributed environment. The work uses the Service Oriented Computing (SOC) paradigm, which provides Knowledge as a Service (KaaS). The interesting patterns are provided via web services, with the help of a knowledge server, to answer consumers' queries. The performance of the approach is evaluated on various databases using execution time and memory consumption.
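The notion of itemset utility against a threshold can be made concrete with a small, single-machine sketch. It uses the usual definition (quantity times unit profit, summed over the transactions that contain the whole itemset) and brute-force enumeration; it is not the paper's distributed XML-based algorithm, and the profits and transactions are made-up illustration data.

```python
from itertools import combinations

# Hypothetical unit profits and transactions (item -> purchased quantity).
unit_profit = {"a": 5, "b": 2, "c": 1}
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 3},
    {"b": 1, "c": 1},
]

def utility(itemset, transaction):
    """Utility of an itemset in one transaction; zero unless every item is present."""
    if not all(item in transaction for item in itemset):
        return 0
    return sum(transaction[item] * unit_profit[item] for item in itemset)

def high_utility_itemsets(transactions, min_utility):
    """Enumerate every itemset and keep those whose total utility meets the threshold."""
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            total = sum(utility(itemset, t) for t in transactions)
            if total >= min_utility:
                result[itemset] = total
    return result

huis = high_utility_itemsets(transactions, min_utility=9)
```

Exhaustive enumeration grows exponentially with the number of items, which is one reason practical HUI miners prune the search space or distribute the work.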

Keywords: Data mining, Knowledge as a Service, service oriented computing, utility mining.

845 Image Indexing Using a Color Similarity Metric based on the Human Visual System

Authors: Angelo Nodari, Ignazio Gallo

Abstract:

The novelty of this study is twofold: a new color similarity metric based on the human visual system, and a new color indexing method based on a textual approach. Because the proposed metric is based on human color perception, the results returned by the indexing system can match user expectations as closely as possible. We developed a web application to collect users' judgments about the similarity between colors, and the results are used to estimate the proposed metric. To index an image's colors, we used a text indexing engine, which facilitates the integration of visual features into a database of text documents. The textual signature is built by weighting the image's colors according to their occurrence in the image. Using a textual indexing engine provides a simple, fast, and robust solution for indexing images. A typical use of the proposed system is the development of applications whose data is both visual and textual. To evaluate the proposed method, we chose a price comparison engine as a case study, collecting a series of commercial offers, each containing a textual description and an image representing the offer.
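One way to picture a textual color signature is to repeat each color name in proportion to its share of the image, so that a text engine's term frequency reflects color weight. The sketch below assumes pixels have already been quantized to color names; the function name, the signature length, and the repetition scheme are illustrative assumptions rather than the authors' exact construction.

```python
from collections import Counter

def textual_signature(pixel_colors, length=10):
    """Emit each color name a number of times proportional to its pixel share."""
    counts = Counter(pixel_colors)
    total = sum(counts.values())
    tokens = []
    for color, n in counts.most_common():
        tokens.extend([color] * round(length * n / total))
    return " ".join(tokens)

# Hypothetical image: 60% red, 30% white, 10% black pixels.
pixels = ["red"] * 6 + ["white"] * 3 + ["black"] * 1
signature = textual_signature(pixels)
```

The resulting string can be stored in the same index field type as a product description, which is what lets one engine serve both visual and textual queries.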

Keywords: Color Extraction, Content-Based Image Retrieval, Indexing

844 Deep iCrawl: An Intelligent Vision-Based Deep Web Crawler

Authors: R.Anita, V.Ganga Bharani, N.Nityanandam, Pradeep Kumar Sahoo

Abstract:

The explosive growth of World Wide Web has posed a challenging problem in extracting relevant data. Traditional web crawlers focus only on the surface web while the deep web keeps expanding behind the scene. Deep web pages are created dynamically as a result of queries posed to specific web databases. The structure of the deep web pages makes it impossible for traditional web crawlers to access deep web contents. This paper, Deep iCrawl, gives a novel and vision-based approach for extracting data from the deep web. Deep iCrawl splits the process into two phases. The first phase includes Query analysis and Query translation and the second covers vision-based extraction of data from the dynamically created deep web pages. There are several established approaches for the extraction of deep web pages but the proposed method aims at overcoming the inherent limitations of the former. This paper also aims at comparing the data items and presenting them in the required order.

Keywords: Crawler, Deep web, Web Database

843 A Face-to-Face Education Support System Capable of Lecture Adaptation and Q&A Assistance Based On Probabilistic Inference

Authors: Yoshitaka Fujiwara, Jun-ichirou Fukushima, Yasunari Maeda

Abstract:

Keys to high-quality face-to-face education are ensuring flexibility in the way lectures are given, and providing care and responsiveness to learners. This paper describes a face-to-face education support system that is designed to raise the satisfaction of learners and reduce the workload on instructors. This system consists of a lecture adaptation assistance part, which assists instructors in adapting teaching content and strategy, and a Q&A assistance part, which provides learners with answers to their questions. The core component of the former part is a “learning achievement map”, which is composed of a Bayesian network (BN). From learners' performance in exercises on relevant past lectures, the lecture adaptation assistance part obtains the information required to appropriately adapt the presentation of the next lecture. The core component of the Q&A assistance part is a case base, which accumulates cases consisting of questions expected from learners and answers to them. The Q&A assistance part is a case-based search system equipped with a search index which performs probabilistic inference. A prototype face-to-face education support system has been built, which is intended for the teaching of Java programming, and this approach was evaluated using this system. The expected degree of understanding of each learner for a future lecture was derived from his or her performance in exercises on past lectures, and this expected degree of understanding was used to select one of three adaptation levels. A model for determining the adaptation level most suitable for the individual learner has been identified. An experimental case base was built to examine the search performance of the Q&A assistance part, and it was found that the rate of successfully finding an appropriate case was 56%.

Keywords: Bayesian network, face-to-face education, lecture adaptation, Q&A assistance.

842 FIR Filter Design via Linear Complementarity Problem, Messy Genetic Algorithm, and Ising Messy Genetic Algorithm

Authors: A.M. Al-Fahed Nuseirat, R. Abu-Zitar

Abstract:

In this paper the design of maximally flat linear phase finite impulse response (FIR) filters is considered. The problem is handled with two entirely different approaches. The first is a completely deterministic numerical approach in which the problem is formulated as a Linear Complementarity Problem (LCP). The other is based on a combination of the Markov Random Field (MRF) approach with the messy genetic algorithm (MGA). MRFs are a class of probabilistic models that have been applied for many years to the analysis of visual patterns or textures. Our objective is to establish MRFs as an interesting approach to modeling messy genetic algorithms. We establish a theoretical result that every genetic algorithm problem can be characterized in terms of an MRF model. This allows us to construct an explicit probabilistic model of the MGA fitness function and introduce the Ising MGA. Experiments with the Ising MGA are less costly than those with the standard MGA since far fewer computations are involved. The LCP requires the fewest computations of all. Results of the LCP, random search, random seeded search, MGA, and Ising MGA are discussed.

Keywords: Filter design, FIR digital filters, LCP, Ising model, MGA, Ising MGA.

841 Learning to Order Terms: Supervised Interestingness Measures in Terminology Extraction

Authors: Jérôme Azé, Mathieu Roche, Yves Kodratoff, Michèle Sebag

Abstract:

Term Extraction, a key data preparation step in Text Mining, extracts the terms, i.e. relevant collocations of words, attached to specific concepts (e.g. genetic-algorithms and decision-trees are terms associated with the concept “Machine Learning”). In this paper, the task of extracting interesting collocations is achieved through a supervised learning algorithm, exploiting a few collocations manually labelled as interesting/not interesting. From these examples, the ROGER algorithm learns a numerical function, inducing some ranking on the collocations. This ranking is optimized using genetic algorithms, maximizing the trade-off between the false positive and true positive rates (Area Under the ROC curve). This approach uses a particular representation for the word collocations, namely the vector of values corresponding to the standard statistical interestingness measures attached to this collocation. As this representation is general (over corpora and natural languages), generality tests were performed by applying the ranking function learned from an English corpus in Biology to a French corpus of Curricula Vitae, and vice versa, showing good robustness of the approach compared to the state-of-the-art Support Vector Machine (SVM).
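The quantity being maximized, the area under the ROC curve of a ranking, can be computed directly from scores and labels. The sketch below scores each collocation with a linear function of its interestingness measures and evaluates two candidate weight vectors. The linear form, the names, and the measure values are illustrative assumptions; the abstract does not give ROGER's function or search procedure.

```python
def auc(scores, labels):
    """Probability that a random positive is ranked above a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def rank_score(weights, measures):
    """Linear ranking function over a collocation's interestingness measures."""
    return sum(w * m for w, m in zip(weights, measures))

# Made-up measure vectors for four collocations; 1 = labelled interesting, 0 = not.
measure_vectors = [(0.9, 0.2), (0.8, 0.1), (0.3, 0.7), (0.2, 0.9)]
labels = [1, 1, 0, 0]

def fitness(weights):
    """AUC of the ranking induced by the given weights; an evolutionary search would maximize this."""
    return auc([rank_score(weights, m) for m in measure_vectors], labels)

good = fitness((1.0, 0.0))  # ranks both interesting collocations first
bad = fitness((0.0, 1.0))   # ranks them last
```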

Keywords: Text-mining, Terminology Extraction, Evolutionary algorithm, ROC Curve.

840 Structural Parsing of Natural Language Text in Tamil Using Phrase Structure Hybrid Language Model

Authors: Selvam M, Natarajan. A M, Thangarajan R

Abstract:

Parsing is important in Linguistics and Natural Language Processing for understanding the syntax and semantics of a natural language grammar. Parsing natural language text is challenging because of problems such as ambiguity and inefficiency. The interpretation of natural language text also depends on context-based techniques. A probabilistic component is essential to resolve ambiguity in both syntax and semantics, thereby increasing the accuracy and efficiency of the parser. The Tamil language has some inherent features that make parsing more challenging. To address them, a lexicalized and statistical approach is applied to parsing with the aid of a language model. Statistical models mainly focus on the semantics of the language and are suitable for large-vocabulary tasks, whereas structural methods focus on syntax and model small-vocabulary tasks. A statistical language model based on trigrams for the Tamil language with a medium vocabulary of 5000 words has been built. Though statistical parsing gives better performance through trigram probabilities and large vocabulary size, it has disadvantages such as a focus on semantics rather than syntax and a lack of support for free word order and long-term relationships. To overcome these disadvantages, a structural component is incorporated into statistical language models, which leads to hybrid language models. This paper attempts to build a phrase-structured hybrid language model that resolves the above-mentioned disadvantages. In developing the hybrid language model, a new part-of-speech tag set for the Tamil language has been developed with more than 500 tags, giving wider coverage. A phrase-structured Treebank has been developed with 326 Tamil sentences covering more than 5000 words. A hybrid language model has been trained on the phrase-structured Treebank using the immediate head parsing technique. A lexicalized and statistical parser that employs this hybrid language model and the immediate head parsing technique gives better results than pure grammar and trigram-based models.
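The trigram component mentioned in the abstract can be illustrated with a minimal sketch. It assumes plain maximum-likelihood estimates without smoothing; the abstract does not say how the authors estimated or smoothed their probabilities. The function names and the toy English corpus are hypothetical placeholders, not the Tamil data used in the paper.

```python
from collections import Counter


def train_trigram_model(sentences):
    """Count trigrams and their two-word contexts from tokenized sentences."""
    trigram_counts = Counter()
    context_counts = Counter()
    for tokens in sentences:
        # Pad so the first real word also has a two-word context.
        padded = ["<s>", "<s>"] + list(tokens) + ["</s>"]
        for i in range(2, len(padded)):
            context = (padded[i - 2], padded[i - 1])
            trigram_counts[context + (padded[i],)] += 1
            context_counts[context] += 1
    return trigram_counts, context_counts


def trigram_probability(model, w1, w2, w3):
    """Unsmoothed estimate of P(w3 | w1, w2); 0.0 for an unseen context."""
    trigram_counts, context_counts = model
    context_total = context_counts[(w1, w2)]
    if context_total == 0:
        return 0.0
    return trigram_counts[(w1, w2, w3)] / context_total


# Hypothetical toy corpus standing in for real training sentences.
corpus = [["the", "cat", "sat"], ["the", "cat", "ran"], ["the", "dog", "sat"]]
model = train_trigram_model(corpus)
print(trigram_probability(model, "the", "cat", "sat"))
```

A model like this conditions only on the two preceding words, which is consistent with the limitation on long-term relationships that the abstract attributes to purely statistical models.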

Keywords: Hybrid Language Model, Immediate Head Parsing, Lexicalized and Statistical Parsing, Natural Language Processing, Parts of Speech, Probabilistic Context Free Grammar, Tamil Language, Tree Bank.

Downloads 3615
839 Socio-Cultural Representations through Lived Religions in Dalrymple’s Nine Lives

Authors: Suman

Abstract:

In the continuous interaction between the past and the present that is historiography, each time history gets re/written, a new representation emerges. This new representation is a reflection of the earlier archives and their interpretations, fragmented remembrances of the past, as well as the reactions to the present. Memory, or lack thereof, and stereotyping generally play a major role in this representation. William Dalrymple’s Nine Lives: In Search of the Sacred in Modern India (2009) is one such written account that sets out to narrate the representations of religion and culture of India and contemporary reactions to it. Dalrymple’s nine saints belong to different castes, sects, religions, and regions. By dealing with their religions and expressions of those religions, and through the lived mysticism of these nine individuals, the book engages with important issues such as class, caste and gender in the contexts provided by historical as well as present India. The paper studies the development of religion and the accompanying feeling of religiosity in modern as well as historical contexts through a study of these elements in the book. Since the language used in the creation of texts, and the literary texts thus produced, create a new reality that questions the stereotypes of the past and in turn often end up creating new stereotypes or stereotypical representations, the paper seeks to actively engage with the text in order to identify and study such stereotypes, along with their changing representations. Through a detailed examination of the book, the paper seeks to unravel whether some socio-cultural stereotypes existed earlier, and whether new stereotypes develop from Dalrymple’s point of view as an outsider writing on issues that are deeply rooted in the cultural milieu of the country. For this analysis, the paper draws on the psycho-literary theories of stereotyping and representation.

Keywords: Religion, Representation, Stereotyping, William Dalrymple.

Downloads 1068
838 Application of Exact String Matching Algorithms towards SMILES Representation of Chemical Structure

Authors: Ahmad Fadel Klaib, Zurinahni Zainol, Nurul Hashimah Ahamed, Rosma Ahmad, Wahidah Hussin

Abstract:

Bioinformatics and Cheminformatics are disciplines that use computers to provide tools for the acquisition, storage, processing, analysis, and integration of biological and chemical data, and for the development of potential applications of such data. A chemical database is a database designed exclusively to store chemical information. NMRShiftDB is one of the main databases used to represent chemical structures in 2D or 3D. The SMILES format is one of many ways to write a chemical structure in a linear format. In this study we extracted antimicrobial structures in SMILES format from NMRShiftDB and stored them in our local data warehouse with their corresponding information. Additionally, we developed a search tool that responds to the user's query using the JME Editor tool, which allows the user to draw or edit molecules and converts the drawn structure into SMILES format. We applied the Quick Search algorithm to search for antimicrobial structures in our local data warehouse.
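As an illustration of the exact string matching step, the following is a minimal sketch of the Quick Search algorithm (Sunday's variant of Boyer-Moore) applied to a SMILES string. The function name and the sample query are hypothetical, and the aspirin SMILES is used only as a familiar example string, not as data from the study. Note that this matches character sequences, not chemical substructures.

```python
def quick_search(pattern, text):
    """Return the start indices of all exact occurrences of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # Bad-character table: for each pattern character, the distance from its
    # last occurrence to one position past the end of the pattern.
    shift = {ch: m - i for i, ch in enumerate(pattern)}
    matches = []
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            matches.append(pos)
        if pos + m >= n:
            break
        # Shift by the character just after the current window; characters
        # absent from the pattern let the window jump past them entirely.
        pos += shift.get(text[pos + m], m + 1)
    return matches


aspirin = "CC(=O)OC1=CC=CC=C1C(=O)O"  # illustrative SMILES string
print(quick_search("C(=O)", aspirin))  # [1, 18]
```

The shift is decided by the single character following the current window, which keeps the preprocessing table small and the inner loop simple.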

Keywords: Exact String-matching Algorithms, NMRShiftDB, SMILES Format, Antimicrobial Structures.

Downloads 2188
837 An Algebra for Protein Structure Data

Authors: Yanchao Wang, Rajshekhar Sunderraman

Abstract:

This paper presents an algebraic approach to optimizing queries in a domain-specific database management system for protein structure data. The approach introduces several protein-structure-specific algebraic operators to query the complex data stored in an object-oriented database system. The Protein Algebra provides an extensible set of high-level Genomic Data Types and Protein Data Types along with a comprehensive collection of appropriate genomic and protein functions. The paper also presents a query translator that converts high-level query specifications in the algebra into low-level query specifications in Protein-QL, a query language designed to query protein structure data. The query transformation process uses a Protein Ontology that serves as a dictionary.

Keywords: Domain-Specific Data Management, Protein Algebra, Protein Ontology, Protein Structure Data.

Downloads 1512
836 Hybrid Authentication System Using QR Code with OTP

Authors: Salim Istyaq

Abstract:

The number of Internet users is increasing drastically. People now use online services provided by banks, colleges and schools, hospitals, utilities, bill payment sites and online shopping sites. Access to these services typically relies on text-based authentication. Text-based authentication schemes have usability and security drawbacks that cause trouble for users. The core element of computational trust is identity. The aim of the paper is to make the system more difficult for impostors and more reliable for users by using a graphical authentication approach. In this paper, the options are encoded in graphical QR format, and an acknowledgment is sent to the user’s mobile device for final verification. The methodology relies on the encryption option and on a final verification step in which legitimate users confirm a set of passphrases; the result is released only once, after the whole process has completed successfully. All processes are linked serially: the output of the first process is the input of the second, and so on. The system combines recognition-based and pure recall-based techniques. The presented scheme is useful for devices such as PDAs, iPods and phones, which are handier and more convenient to use than traditional desktop computer systems.
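The abstract does not say how the one-time password is generated or checked. As one possible illustration of the final verification step, the sketch below uses the counter-based HOTP construction from RFC 4226 as a stand-in; that choice, the function names, and the shared secret are assumptions, and the QR encoding step is omitted because it would need a third-party library.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """Compute an HOTP value (RFC 4226) for a shared secret and counter."""
    # HMAC-SHA1 over the counter encoded as an 8-byte big-endian integer.
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low nibble of the last byte selects 4 bytes.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify_otp(secret: bytes, counter: int, submitted: str) -> bool:
    """Compare the submitted code with the expected one in constant time."""
    return hmac.compare_digest(hotp(secret, counter), submitted)


# Hypothetical shared secret (the RFC 4226 test secret, for illustration only).
secret = b"12345678901234567890"
print(hotp(secret, 0))                   # 755224 (RFC 4226 test vector)
print(verify_otp(secret, 1, "287082"))   # True
```

In a serial flow like the one described, the server would advance the counter after each successful verification so that a code cannot be replayed.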

Keywords: Graphical Password, OTP, QR Codes, Recognition based graphical user authentication, usability and security.

Downloads 1630
835 JaCoText: A Pretrained Model for Java Code-Text Generation

Authors: Jessica Lòpez Espejel, Mahaman Sanoussi Yahaya Alassan, Walid Dahhane, El Hassane Ettifouri

Abstract:

Pretrained transformer-based models have shown high performance in natural language generation tasks. However, a new wave of interest has surged: automatic programming language generation. This task consists of translating natural language instructions into programming code. Although well-known pretrained language generation models have achieved good performance in learning programming languages, effort is still needed in automatic code generation. In this paper, we introduce JaCoText, a model based on the Transformer neural network. It aims to generate Java source code from natural language text. JaCoText leverages the advantages of both natural language and code generation models. More specifically, we study some findings from the state of the art and use them to (1) initialize our model from powerful pretrained models, (2) explore additional pretraining on our Java dataset, (3) carry out experiments combining unimodal and bimodal data in training, and (4) scale the input and output length during fine-tuning of the model. Experiments conducted on the CONCODE dataset show that JaCoText achieves new state-of-the-art results.

Keywords: Java code generation, Natural Language Processing, Sequence-to-sequence Models, Transformers Neural Networks.

Downloads 766
834 TOSOM: A Topic-Oriented Self-Organizing Map for Text Organization

Authors: Hsin-Chang Yang, Chung-Hong Lee, Kuo-Lung Ke

Abstract:

The self-organizing map (SOM) model is a well-known neural network model with a wide range of applications. The main characteristics of the SOM are two-fold, namely dimension reduction and topology preservation. Using a SOM, a high-dimensional data space is mapped to a low-dimensional space while the topological relations among the data are preserved. With such characteristics, the SOM has typically been applied to data clustering and visualization tasks. However, the SOM has the main disadvantage that the number and structure of neurons must be known prior to training, and these are difficult to determine. Several schemes have been proposed to tackle this deficiency. Examples are the growing/expandable SOM, the hierarchical SOM, and the growing hierarchical SOM. These schemes can dynamically expand the map, and even generate hierarchical maps, during training. Encouraging results have been reported. Basically, these schemes adapt the size and structure of the map according to the distribution of the training data. That is, they are data-driven or data-oriented SOM schemes. In this work, a topic-oriented SOM scheme suitable for document clustering and organization is developed. The proposed SOM automatically adapts the number as well as the structure of the map according to identified topics. Unlike other data-oriented SOMs, our approach expands the map and generates the hierarchies according to both the topics and the characteristics of the neurons. The preliminary experiments give promising results and demonstrate the plausibility of the method.
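For readers unfamiliar with the baseline model, here is a minimal sketch of a conventional fixed-size SOM, not of the topic-oriented scheme proposed in the paper. The grid size, learning-rate schedule, Gaussian neighborhood, and function names are all assumptions chosen for illustration; the fixed `rows` and `cols` arguments are exactly the quantities the abstract says must be known before training.

```python
import math
import random


def best_matching_unit(weights, sample):
    """Return (row, col) of the neuron whose weight vector is closest to sample."""
    best, best_dist = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            dist = sum((wi - si) ** 2 for wi, si in zip(w, sample))
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best


def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a rows x cols map on data (a list of equal-length vectors)."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1 - frac)                    # learning rate decays linearly
        sigma = sigma0 * (1 - frac) + 0.01       # neighborhood radius shrinks
        for sample in data:
            br, bc = best_matching_unit(weights, sample)
            for r in range(rows):
                for c in range(cols):
                    # Gaussian neighborhood measured on the map grid, so
                    # neurons near the winner move more than distant ones.
                    grid_dist2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-grid_dist2 / (2 * sigma ** 2))
                    w = weights[r][c]
                    for i in range(dim):
                        w[i] += lr * h * (sample[i] - w[i])
    return weights


# Hypothetical two-dimensional samples forming two loose groups.
data = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
som = train_som(data, rows=2, cols=2)
print(best_matching_unit(som, [0.95, 1.0]))
```

Updating the winner's grid neighbors along with the winner is what yields topology preservation: nearby neurons end up representing nearby regions of the input space.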

Keywords: Self-organizing map, topic identification, learning algorithm, text clustering.

Downloads 2003
833 Exploring Social Impact of Emerging Technologies from Futuristic Data

Authors: Heeyeul Kwon, Yongtae Park

Abstract:

Despite their highly touted benefits, emerging technologies have unleashed pervasive concerns regarding unintended and unforeseen social impacts. Thus, those wishing to create safe and socially acceptable products need to identify such side effects and mitigate them prior to market proliferation. Various methodologies in the field of technology assessment (TA), namely Delphi, impact assessment, and scenario planning, have been widely used in such circumstances. However, the literature faces a major limitation in its sole reliance on participatory workshop activities. It has overlooked the availability of a massive untapped source of futuristic information flooding through the Internet. This research thus seeks to gain insights into the use of futuristic data, that is, future-oriented documents from the Internet, as a supplementary method to generate social impact scenarios whilst capturing the perspectives of experts from a wide variety of disciplines. To this end, network analysis is conducted on the social keywords extracted from the futuristic documents by text mining, and the result is used as a guide to produce a comprehensive set of detailed scenarios. Our proposed approach facilitates harmonized depictions of possible hazardous consequences of emerging technologies and thereby makes decision makers more aware of, and responsive to, broad qualitative uncertainties.
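The abstract does not specify how the keyword network is built. One common construction, assumed here purely for illustration, links two keywords whenever they appear in the same document and weights the link by the number of such documents. The function names, keyword list, and sample documents below are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(documents, keywords):
    """Count, for each keyword pair, the documents in which both appear."""
    keywords = set(keywords)
    edges = Counter()
    for text in documents:
        # Whitespace tokenization is a simplification of real text mining.
        present = sorted(keywords & set(text.lower().split()))
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return edges


def weighted_degree(edges):
    """Sum the edge weights incident to each keyword."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree


# Hypothetical future-oriented documents and social keywords.
docs = [
    "drones raise privacy and safety concerns",
    "privacy and surveillance concerns grow",
    "safety rules for drones",
]
keywords = ["privacy", "safety", "surveillance", "drones"]

edges = cooccurrence_edges(docs, keywords)
print(edges[("drones", "safety")])           # 2
print(weighted_degree(edges)["surveillance"])  # 1
```

Keywords with a high weighted degree sit at the center of the network and could serve as starting points when drafting scenarios.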

Keywords: Emerging technologies, futuristic data, scenario, text mining.

Downloads 2369
832 An Integrative Review of Changes of Family Relationship and Mental Health that Chinese Men Experience during Transition to Fatherhood

Authors: Mo Zhou, Samantha Ashby, Lyn Ebert

Abstract:

In China, the changes that men experience in the perinatal period are not well researched. Men are also at risk of maladaptation to parenthood. The aim of this research is to review current studies of the changes that Chinese men experience during the transition to parenthood. Five databases were searched for relevant papers. The search found 128 articles. Based on the inclusion and exclusion criteria, 35 articles were included in this integrative review. Results showed that the changes Chinese fathers experienced during the transition to parenthood can be divided into two aspects: family relationships and mental health. During the transition to parenthood, fathers usually experienced increased disappointment with marital conflict resolution and decreased sexual intimacy with their partner. Mental health declined, with fathers often feeling depressed and/or anxious during this time. Some men were diagnosed with clinical depression. The predictors of these changes fell into three domains: personal background (age and income), family background (gender of infant, relationship status and unplanned child) and cultural background (‘doing the month’, Confucianism, policy, social support).

Keywords: China, fathers, life change, prenatal, postpartum.

Downloads 644