Search results for: query rewriting
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 176

26 Energy Efficient In-Network Data Processing in Sensor Networks

Authors: Prakash G L, Thejaswini M, S H Manjula, K R Venugopal, L M Patnaik

Abstract:

A sensor network consists of densely deployed sensor nodes. Energy optimization is one of the most important aspects of sensor application design. Data acquisition and aggregation techniques for processing data in-network should be energy efficient. Due to the cross-layer design and the resource-limited, noisy nature of Wireless Sensor Networks (WSNs), it is challenging to study the performance of these systems in a realistic setting. In this paper, we propose optimizing queries through data aggregation and the exploitation of data redundancy, which reduces energy consumption without requiring all sensed data, and we use the directed diffusion communication paradigm to achieve power savings, robust communication, and in-network data processing. To estimate the per-node power consumption, the PowerTOSSIM mica2 energy model is used, which provides scalable and accurate results. The performance analysis shows that the proposed methods outperform the existing methods in terms of energy consumption in wireless sensor networks.

Keywords: Data Aggregation, Directed Diffusion, Partial Aggregation, Packet Merging, Query Plan.

Downloads: 1790
25 Categorizing Search Result Records Using Word Sense Disambiguation

Authors: R. Babisaraswathi, N. Shanthi, S. S. Kiruthika

Abstract:

Web search engines are designed to retrieve and extract information from web databases and to return dynamic web pages. The Semantic Web is an extension of the current web that includes semantic content in web pages. The main goal of the Semantic Web is to improve the quality of the current web by converting its contents into machine-understandable form. Therefore, the milestone of the Semantic Web is to have semantic-level information on the web. Nowadays, people use different keyword-based search engines to find the relevant information they need from the web, but many words are polysemous. When these words are used to query a search engine, it displays Search Result Records (SRRs) with different meanings. The SRRs with similar meanings are grouped together based on Word Sense Disambiguation (WSD). In addition, semantic annotation is performed to improve the efficiency of search result records. Semantic annotation is the process of adding semantic metadata to web resources. The grouped SRRs are thus annotated, and a summary is generated that describes the information in the SRRs. Automatic semantic annotation, however, is a significant challenge in the Semantic Web. Here, ontology and knowledge-based representation are used to annotate the web pages.

Keywords: Ontology, Semantic Web, WordNet, Word Sense Disambiguation.

Downloads: 1716
24 Application of Exact String Matching Algorithms towards SMILES Representation of Chemical Structure

Authors: Ahmad Fadel Klaib, Zurinahni Zainol, Nurul Hashimah Ahamed, Rosma Ahmad, Wahidah Hussin

Abstract:

Bioinformatics and cheminformatics are disciplines that use computers to provide tools for the acquisition, storage, processing, analysis, and integration of data and for the development of potential applications of biological and chemical data. A chemical database is a database designed exclusively to store chemical information. NMRShiftDB is one of the main databases used to represent chemical structures in 2D or 3D. The SMILES format is one of many ways to write a chemical structure in a linear format. In this study, we extracted antimicrobial structures in SMILES format from NMRShiftDB and stored them in our Local Data Warehouse with their corresponding information. Additionally, we developed a search tool that responds to a user's query using the JME Editor tool, which allows the user to draw or edit molecules and converts the drawn structure into SMILES format. We applied the Quick Search algorithm to search for antimicrobial structures in our Local Data Warehouse.

Keywords: Exact String-matching Algorithms, NMRShiftDB, SMILES Format, Antimicrobial Structures.
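
The Quick Search algorithm named in this abstract can be illustrated with a minimal sketch. The function name `quick_search` and the example SMILES string are assumptions made for illustration; they are not taken from the paper or from NMRShiftDB. Note that this is exact string matching on the SMILES text, not chemical substructure matching.

```python
def quick_search(pattern: str, text: str) -> list[int]:
    """Return the start index of every exact occurrence of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # Bad-character table: for each pattern character, the distance from its
    # last occurrence to one position past the end of the pattern.
    shift = {ch: m - i for i, ch in enumerate(pattern)}
    matches = []
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            matches.append(pos)
        if pos + m >= n:
            break  # no character after the window, so no further shift
        # Shift by the character just after the window (m + 1 if it is
        # not in the pattern at all).
        pos += shift.get(text[pos + m], m + 1)
    return matches


# Hypothetical usage: find a carboxyl-like fragment in a SMILES string.
print(quick_search("C(=O)O", "CC(=O)Oc1ccccc1C(=O)O"))
```

The shift is driven by the character immediately to the right of the current window, which is what distinguishes Quick Search from the Boyer-Moore family it derives from.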

Downloads: 2173
23 Novel Nanomagnetic Beads Based - Latex Agglutination Assay for Rapid Diagnosis of Human Schistosomiasis Haematobium

Authors: Ibrahim Aly, Rabab Zalat, Bahaa EL Deen W. El Aswad, Ismail M. Moharm, Basam M. Masoud, Tarek Diab

Abstract:

The objective of the present study was to evaluate the novel nanomagnetic beads-based latex agglutination assay (NMB-LAT) as a simple test for diagnosis of S. haematobium, as well as to standardize the novel nanomagnetic beads-based ELISA (NMB-ELISA). According to urine examination, this study included 85 S. haematobium infected patients, 30 patients infected with other parasites, and 25 negative control samples. The sensitivity of the novel NMB-LAT was 82.4%, versus 96.5% and 88.2% for NMB-ELISA and the currently used sandwich ELISA, respectively. The specificity of NMB-LAT was 83.6%, versus 96.3% and 87.3% for NMB-ELISA and the currently used sandwich ELISA, respectively. In conclusion, the novel NMB-ELISA is a valuable, applicable diagnostic technique for diagnosis of human schistosomiasis haematobium. The novel NMB-ELISA assay is a suitable diagnostic method for field surveys, especially when followed by ELISA as a confirmatory test to query false negative results. Trials are required to increase the sensitivity and specificity of the NMB-ELISA assay.

Keywords: Diagnosis, Latex agglutination, Nanomagnetic beads, Sandwich ELISA.

Downloads: 3055
22 An Approach for Reducing the End-to-end Delay and Increasing Network Lifetime in Mobile Adhoc Networks

Authors: R. Asokan, A. M. Natarajan

Abstract:

A mobile adhoc network (MANET) is a collection of mobile devices which form a communication network with no preexisting wiring or infrastructure. Multiple routing protocols have been developed for MANETs. As MANETs gain popularity, their need to support real-time applications is growing as well. Such applications have stringent quality of service (QoS) requirements such as throughput, end-to-end delay, and energy. Due to dynamic topology and bandwidth constraints, supporting QoS is a challenging task. QoS-aware routing is an important building block for QoS support. The primary goal of a QoS-aware protocol is to determine a path from source to destination that satisfies the QoS requirements. This paper proposes a new energy and delay aware protocol called energy and delay aware TORA (EDTORA), based on an extension of the Temporally Ordered Routing Protocol (TORA). Energy and delay verification of the query packet is performed at each node. Simulation results show that the proposed protocol performs better than TORA in terms of network lifetime, packet delivery ratio, and end-to-end delay.

Keywords: EDTORA, Mobile Adhoc Networks, QoS, Routing, TORA

Downloads: 2351
21 Effective Digital Music Retrieval System through Content-based Features

Authors: Bokyung Sung, Kwanghyo Koo, Jungsoo Kim, Myung-Bum Jung, Jinman Kwon, Ilju Ko

Abstract:

In this paper, we propose an effective system for digital music retrieval. We divided the proposed system into a Client and a Server. The Client part consists of pre-processing and content-based feature extraction stages. In the pre-processing stage, we minimized the time code gap that occurs among identical music contents. As the content-based feature, first-order differentiated MFCCs were used; these approximate the envelope of the music feature sequences. The Server part includes the Music Server and the Music Matching stage. Features extracted from 1,000 digital music files were stored in the Music Server. In the Music Matching stage, we obtained the retrieval result through a similarity measure based on DTW. In the experiment, we used 450 queries, made by mixing different compression standards and sound qualities from 50 digital music files. Retrieval accuracy was 97%, and retrieval time averaged 15 ms per query. Our experiment showed that the proposed system is effective at retrieving digital music and robust across the various user environments of the web.

Keywords: Music Retrieval, Content-based, Music Feature and Digital Music.
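
The matching stage described in this abstract compares feature sequences with DTW (dynamic time warping). The following is a minimal sketch of that idea, assuming one scalar feature per frame for brevity (real MFCC frames are vectors); the names `delta` and `dtw_distance` are hypothetical and do not come from the paper.

```python
import math


def delta(seq):
    """First-order difference of a sequence (a simple stand-in for delta features)."""
    return [seq[i + 1] - seq[i] for i in range(len(seq) - 1)]


def dtw_distance(a, b):
    """Classic dynamic time warping distance between two scalar sequences."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical usage: the second sequence is a time-stretched copy of the first.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Because DTW allows one frame to align with several frames of the other sequence, a time-stretched copy of a sequence can still score a distance of zero, which is why it suits queries that differ in encoding or timing.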

Downloads: 1475
20 A Real-Time Rendering based on Efficient Updating of Static Objects Buffer

Authors: Youngjae Chun, Kyoungsu Oh

Abstract:

Real-time 3D applications have to guarantee interactive rendering speed. The number of polygons that can be rendered is restricted by the performance of the graphics hardware or graphics algorithms. Generally, rendering performance increases drastically when handling only the dynamic 3D models, which are far fewer than the static ones. Since the shapes and colors of the static objects don't change when the viewing direction is fixed, this information can be reused. We render huge numbers of polygons that cannot be handled by conventional rendering techniques in real time by using a static object image and merging it with the rendering result of the dynamic objects. Performance decreases as a consequence of updating the static object image, which includes removing a static object that starts to move and re-rendering the other static objects overlapped by the moving one. Based on the visibility of the object beginning to move, we can skip the updating process. As a result, we enhance rendering performance and reduce the differences in rendering speed between frames. The proposed method renders a total of 200,000,000 polygons, consisting of 500,000 dynamic polygons with the rest static, at about 100 frames per second.

Keywords: Occlusion query, Real-time rendering, Temporal coherence.

Downloads: 1664
19 Leveraging Quality Metrics in Voting Model Based Thread Retrieval

Authors: Atefeh Heydari, Mohammadali Tavakoli, Zuriati Ismail, Naomie Salim

Abstract:

Seeking and sharing knowledge on online forums has made them popular in recent years. Although online forums are valuable sources of information, the variety of message sources makes retrieving reliable threads with high-quality content an issue. The majority of existing information retrieval systems ignore the quality of retrieved documents, particularly in the field of thread retrieval. In this research, we present an approach that employs various quality features in order to investigate the quality of retrieved threads. Different aspects of content quality, including completeness, comprehensiveness, and politeness, are assessed using these features, which leads to finding threads that are not only textually but also conceptually relevant to a user query within a forum. To analyse the influence of the features, we used an adopted version of voting model thread search as a retrieval system. We equipped it with each feature individually and with various combinations of features in turn during multiple runs. The results show that incorporating the quality features enhances the effectiveness of the utilised retrieval system significantly.

Keywords: Content quality, Forum search, Thread retrieval, Voting techniques.

Downloads: 1712
18 Computing Continuous Skyline Queries without Discriminating between Static and Dynamic Attributes

Authors: Ibrahim Gomaa, Hoda M. O. Mokhtar

Abstract:

Although most existing skyline query algorithms focus on querying static points in static databases, the expanding number of sensors, wireless communications, and mobile applications has increased the demand for continuous skyline queries. Unlike traditional skyline queries, which only consider static attributes, continuous skyline queries include dynamic attributes as well as static ones. However, because skyline query computation is based on checking the domination of skyline points over all dimensions, both the static and dynamic attributes must be considered without separation. In this paper, we present an efficient algorithm for computing continuous skyline queries without discriminating between static and dynamic attributes. In brief, our algorithm proceeds as follows. First, it excludes the points which will not be in the initial skyline result; this pruning phase reduces the required number of comparisons. Second, the association between the spatial positions of data points is examined; this phase gives an idea of where changes in the result might occur and consequently enables us to efficiently update the skyline result (continuous update) rather than computing the skyline from scratch. Finally, an experimental evaluation is provided which demonstrates the accuracy, performance, and efficiency of our algorithm over other existing approaches.

Keywords: Continuous query processing, dynamic database, moving object, skyline queries.
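
The domination check that this abstract refers to can be sketched as follows. This is the naive from-scratch skyline computation, shown only as a baseline illustration; it is not the paper's pruning or continuous-update algorithm. The sketch assumes that smaller values are preferred in every dimension, and the names `dominates` and `skyline` are hypothetical.

```python
def dominates(p, q):
    """True if p is at least as good as q in every dimension and strictly
    better in at least one (assuming smaller values are better)."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points):
    """Return the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical usage: (distance, price) pairs, lower is better in both.
candidates = [(1, 9), (3, 3), (4, 4), (9, 1), (5, 5)]
print(skyline(candidates))
```

This baseline compares every pair of points, so its cost grows quadratically with the number of points, which is the kind of overhead that pruning and incremental updates aim to avoid.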

Downloads: 1207
17 EEIA: Energy Efficient Indexed Aggregation in Smart Wireless Sensor Networks

Authors: Mohamed Watfa, William Daher, Hisham Al Azar

Abstract:

The main idea behind in-network aggregation is that, rather than sending individual data items from sensors to sinks, multiple data items are aggregated as they are forwarded by the sensor network. Existing sensor network data aggregation techniques assume that the nodes are preprogrammed and send data to a central sink for offline querying and analysis. This approach has two major drawbacks. First, the system behavior is preprogrammed and cannot be modified on the fly. Second, the increased energy wastage due to communication overhead decreases the overall system lifetime. Thus, energy conservation is a prime consideration in sensor network protocols in order to maximize the network's operational lifetime. In this paper, we give an energy-efficient approach to query processing by implementing new optimization techniques applied to in-network aggregation. We first discuss earlier approaches to sensor data management and highlight their disadvantages. We then present our approach, "Energy Efficient Indexed Aggregation" (EEIA), and evaluate it through several simulations to demonstrate its efficiency, competence, and effectiveness.

Keywords: Sensor Networks, Data Base, Data Fusion, Aggregation, Indexing, Energy Efficiency
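
The general idea of in-network aggregation described in this abstract can be sketched with a partial-state record for an AVG query: each node forwards one small (sum, count) pair instead of every raw reading. This is a generic illustration under assumed names (`Partial`, `merge`, `aggregate_up`, the routing tree layout); it does not reproduce the EEIA indexing scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partial:
    """Partial state for AVG: enough to merge without the raw readings."""
    total: float
    count: int


def merge(a: Partial, b: Partial) -> Partial:
    """Combine two partial states into one of the same size."""
    return Partial(a.total + b.total, a.count + b.count)


def aggregate_up(tree: dict, readings: dict, node: str) -> Partial:
    """Aggregate a node's own reading with the partial states of its children."""
    state = Partial(readings[node], 1)
    for child in tree.get(node, []):
        state = merge(state, aggregate_up(tree, readings, child))
    return state


# Hypothetical routing tree (parent -> children) and temperature readings.
tree = {"sink": ["a", "b"], "a": ["c"]}
readings = {"sink": 20.0, "a": 22.0, "b": 24.0, "c": 26.0}
result = aggregate_up(tree, readings, "sink")
print(result.total / result.count)
```

Each link carries a single fixed-size record regardless of how many sensors lie below it, which is where the communication savings come from.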

Downloads: 1748
16 Content-Based Image Retrieval Using HSV Color Space Features

Authors: Hamed Qazanfari, Hamid Hassanpour, Kazem Qazanfari

Abstract:

In this paper, a method is provided for content-based image retrieval. A content-based image retrieval system searches an image database for images similar to a query image based on its visual content. In this paper, with the aim of simulating the sensitivity of the human visual system to an image's edges and color features, the concept of the color difference histogram (CDH) is used. The CDH captures the perceptual color difference between two neighboring pixels with regard to colors and edge orientations. Since the HSV color space is close to the human visual system, the CDH is calculated in this color space. In addition, to improve the color features, the color histogram in HSV color space is also used as a feature. Among the extracted features, efficient features are selected using entropy and correlation criteria. The final features represent the content of images most efficiently. The proposed method has been evaluated on three standard databases: Corel 5k, Corel 10k, and UKBench. Experimental results show that the accuracy of the proposed image retrieval method is significantly improved compared to recently developed methods.

Keywords: Content-based image retrieval, color difference histogram, efficient features selection, entropy, correlation.

Downloads: 608
15 Improving Spatiotemporal Change Detection: A High Level Fusion Approach for Discovering Uncertain Knowledge from Satellite Image Database

Authors: Wadii Boulila, Imed Riadh Farah, Karim Saheb Ettabaa, Basel Solaiman, Henda Ben Ghezala

Abstract:

This paper investigates the problem of tracking spatiotemporal changes of a satellite image through the use of Knowledge Discovery in Database (KDD). The purpose of this study is to help a given user effectively discover interesting knowledge and then build prediction and decision models. Unfortunately, the KDD process for spatiotemporal data is always marked by several types of imperfections. In our paper, we take these imperfections into consideration in order to provide more accurate decisions. To achieve this objective, different KDD methods are used to discover knowledge in satellite image databases. Each method presents a different point of view of spatiotemporal evolution of a query model (which represents an extracted object from a satellite image). In order to combine these methods, we use the evidence fusion theory which considerably improves the spatiotemporal knowledge discovery process and increases our belief in the spatiotemporal model change. Experimental results of satellite images representing the region of Auckland in New Zealand depict the improvement in the overall change detection as compared to using classical methods.

Keywords: Knowledge discovery in satellite databases, knowledge fusion, data imperfection, data mining, spatiotemporal change detection.

Downloads: 1497
14 A General Framework for Knowledge Discovery Using High Performance Machine Learning Algorithms

Authors: S. Nandagopalan, N. Pradeep

Abstract:

The aim of this paper is to propose a general framework for storing, analyzing, and extracting knowledge from two-dimensional echocardiographic images, color Doppler images, non-medical images, and general data sets. A number of high-performance data mining algorithms have been used to carry out this task. Our framework encompasses four layers, namely physical storage, object identification, knowledge discovery, and user level. Techniques such as an active contour model to identify the cardiac chambers, pixel classification to segment the color Doppler echo image, a universal model for image retrieval, a Bayesian method for classification, and parallel algorithms for image segmentation were employed. Using the feature vector database that has been efficiently constructed, one can perform various data mining tasks such as clustering and classification with efficient algorithms, along with image mining given a query image. All these facilities are included in the framework, which is supported by a state-of-the-art user interface (UI). The algorithms were tested with actual patient data and the Coral image database, and the results show that their performance is better than previously reported results.

Keywords: Active Contour, Bayesian, Echocardiographic image, Feature vector.

Downloads: 1664
13 Introducing Sequence-Order Constraint into Prediction of Protein Binding Sites with Automatically Extracted Templates

Authors: Yi-Zhong Weng, Chien-Kang Huang, Yu-Feng Huang, Chi-Yuan Yu, Darby Tien-Hao Chang

Abstract:

Searching for a tertiary substructure that geometrically matches the 3D pattern of the binding site of a well-studied protein provides a way to predict protein functions. In our previous work, a web server was built to predict protein-ligand binding sites based on automatically extracted templates. However, a drawback of such templates is that the web server was prone to producing many false positive matches. In this study, we present a sequence-order constraint to reduce the false positive matches that arise when using automatically extracted templates to predict protein-ligand binding sites. The binding site predictor comprises i) an automatically constructed template library and ii) a local structure alignment algorithm for querying the library. The sequence-order constraint is employed to identify inconsistency between the local regions of the query protein and the templates. Experimental results reveal that the sequence-order constraint can largely reduce the false positive matches and is effective for template-based binding site prediction.

Keywords: Protein structure, binding site, functional prediction

Downloads: 1416
12 Mining Genes Relations in Microarray Data Combined with Ontology in Colon Cancer Automated Diagnosis System

Authors: A. Gruzdz, A. Ihnatowicz, J. Siddiqi, B. Akhgar

Abstract:

The MATCH project [1] entails the development of an automatic diagnosis system that aims to support the treatment of colon cancer diseases by discovering mutations that occur in tumour suppressor genes (TSGs) and contribute to the development of cancerous tumours. The system is based on a) colon cancer clinical data and b) biological information that will be derived by data mining techniques from genomic and proteomic sources. The core mining module will consist of popular, well-tested hybrid feature extraction methods and new combined algorithms designed especially for the project. Elements of rough sets, evolutionary computing, cluster analysis, self-organizing maps, and association rules will be used to discover the annotations between genes and their influence on tumours [2]-[11]. The methods used to process the data have to address their high complexity, potential inconsistency, and the problem of missing values. They must integrate all the useful information necessary to answer the expert's question. For this purpose, the system has to learn from data, or allow a domain specialist to specify interactively, the part of the knowledge structure it needs to answer a given query. The program should also take into account the importance or rank of the particular parts of the data it analyses and adjust the algorithms used accordingly.

Keywords: Bioinformatics, gene expression, ontology, self-organizing maps.

Downloads: 1931
11 SPA-VNDN: Enhanced Smart Parking Application by Vehicular Named Data Networking

Authors: Bassma Aldahlan, Zongming Fei

Abstract:

Recently, there has been great interest in smart parking applications. These applications are enhanced by a vehicular ad-hoc network, which helps drivers find and reserve suitable parking spaces for a period of time ahead of time. Named Data Networking (NDN) is a future Internet architecture that benefits vehicular ad-hoc networks because of its clean-slate design and pure communication model. In this paper, we propose an NDN-based framework for smart parking that involves a fog computing architecture. The proposed application has two main directions. First, we allow drivers to query the number of parking spaces in a particular parking lot. Second, we introduce a technique that enables drivers to make intelligent reservations before their arrival time. We also introduce a "push-based" model supporting the NDN-based framework for smart parking applications. To evaluate the proposed solution's performance, we analyzed the function for finding parking lots with available parking spaces and the function for reserving a parking space. Our system showed high performance in terms of response time and push overhead. The proposed reservation application performed better than the baseline approach.

Keywords: Cloud Computing, Vehicular Named Data Networking, Smart Parking Applications, Fog Computing.

Downloads: 147
10 Detecting Earnings Management via Statistical and Neural Network Techniques

Authors: Mohammad Namazi, Mohammad Sadeghzadeh Maharluie

Abstract:

Predicting earnings management is vital for capital market participants, financial analysts, and managers. This research attempts to answer the following query: Is there a significant difference between the regression model and neural network models in predicting earnings management, and which one leads to a superior prediction? In approaching this question, a Linear Regression (LR) model was compared with two neural networks, a Multi-Layer Perceptron (MLP) and a Generalized Regression Neural Network (GRNN). The population of this study includes 94 companies listed on the Tehran Stock Exchange (TSE) from 2003 to 2011. After the results of all models were acquired, ANOVA was applied to test the hypotheses. In general, the summary of statistical results showed that the precision of GRNN did not differ significantly from that of MLP. In addition, the mean square errors of the MLP and GRNN differed significantly from that of the multivariable LR model. These findings support the notion that earnings management behaves nonlinearly. Therefore, it is more appropriate for capital market participants to analyze earnings management using neural network techniques rather than linear regression models.

Keywords: Earnings management, generalized regression neural networks, linear regression, multi-layer perceptron, Tehran stock exchange.

Downloads: 2063
9 A New Model for Question Answering Systems

Authors: Mohammad Reza Kangavari, Samira Ghandchi, Manak Golpour

Abstract:

Most Question Answering (QA) systems are composed of three main modules: question processing, document processing, and answer processing. The question processing module plays an important role in QA systems; if this module doesn't work properly, it causes problems for the other modules. Moreover, answer processing is an emerging topic in Question Answering, where systems are often required to rank and validate candidate answers. These techniques, which aim at finding short and precise answers, are often based on semantic classification. This paper discusses a new model for question answering which improves two main modules, question processing and answer processing. Two important components form the basis of question processing. The first component is question classification, which specifies the types of question and answer. The second is reformulation, which converts the user's question into a question that the QA system can understand in a specific domain. The answer processing module consists of candidate answer filtering and candidate answer ordering components, and it also has a validation section for interacting with the user. This module makes the system more suitable for finding exact answers. In this paper, we describe the question and answer processing modules by modeling, implementing, and evaluating the system. The system was implemented in two versions. Results show that 'Version No.1' gave correct answers to 70% of questions (30 correct answers to 50 asked questions) and 'Version No.2' gave correct answers to 94% of questions (47 correct answers to 50 asked questions).

Keywords: Answer Processing, Classification, Question Answering and Query Reformulation.

Downloads: 2076
8 A Pattern Recognition Neural Network Model for Detection and Classification of SQL Injection Attacks

Authors: Naghmeh Moradpoor Sheykhkanloo

Abstract:

Thousands of organisations store important and confidential information related to them, their customers, and their business partners in databases all across the world. The stored data ranges from less sensitive (e.g. first name, last name, date of birth) to more sensitive data (e.g. password, pin code, and credit card information). Losing data, disclosing confidential information, or even changing the value of data are the severe damages that a Structured Query Language injection (SQLi) attack can cause to a given database. It is a code injection technique in which malicious SQL statements are inserted into a given SQL database simply by using a web browser. In this paper, we propose an effective pattern recognition neural network model for detection and classification of SQLi attacks. The proposed model is built from three main elements: a Uniform Resource Locator (URL) generator, which generates thousands of malicious and benign URLs; a URL classifier, which 1) classifies each generated URL as either benign or malicious and 2) classifies the malicious URLs into different SQLi attack categories; and a NN model, which 1) detects whether a given URL is malicious or benign and 2) identifies the type of SQLi attack for each malicious URL. The model is first trained and then evaluated using thousands of benign and malicious URLs. The results of the experiments are presented in order to demonstrate the effectiveness of the proposed approach.

Keywords: Neural Networks, pattern recognition, SQL injection attacks, SQL injection attack classification, SQL injection attack detection.
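
The injection mechanism this abstract describes can be illustrated with a small, self-contained sketch using Python's built-in `sqlite3` module. It shows only how attacker-controlled text changes the meaning of a concatenated query, and how a parameterized query avoids that; it is not the paper's neural network detector. The table layout, the function names, and the payload are assumptions chosen for illustration.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with two example users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn, name):
    """Vulnerable: user input is concatenated directly into the SQL text."""
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]


def find_user_safe(conn, name):
    """Parameterized: the input is bound as data and never parsed as SQL."""
    query = "SELECT name FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]


conn = make_demo_db()
payload = "x' OR '1'='1"  # a classic tautology-style input
print(sorted(find_user_unsafe(conn, payload)))  # condition becomes always true
print(find_user_safe(conn, payload))            # treated as a literal name
```

With the concatenated query, the payload turns the WHERE clause into a condition that holds for every row; with the bound parameter, the same text is compared as an ordinary string and matches nothing.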

Downloads: 2779
7 Context Detection in Spreadsheets Based on Automatically Inferred Table Schema

Authors: Alexander Wachtel, Michael T. Franzen, Walter F. Tichy

Abstract:

Programming requires years of training. With natural language and end user development methods, programming could become available to everyone. It enables end users to program their own devices and extend the functionality of the existing system without any knowledge of programming languages. In this paper, we describe an Interactive Spreadsheet Processing Module (ISPM), a natural language interface to spreadsheets that allows users to address ranges within the spreadsheet based on inferred table schema. Using the ISPM, end users are able to search for values in the schema of the table and to address the data in spreadsheets implicitly. Furthermore, it enables them to select and sort the spreadsheet data by using natural language. ISPM uses a machine learning technique to automatically infer areas within a spreadsheet, including different kinds of headers and data ranges. Since ranges can be identified from natural language queries, the end users can query the data using natural language. During the evaluation, 12 undergraduate students were asked to perform operations (sum, sort, group, and select) using the system and also Excel without the ISPM interface, and the time taken for task completion was compared across the two systems. Only for the selection task did users take less time in Excel (since they directly selected the cells using the mouse) than in ISPM. The work uses natural language for end user software engineering to overcome the present bottleneck of professional developers.

Keywords: Natural language processing, end user development, natural language interfaces, human computer interaction, data recognition, dialog systems, spreadsheet.

Downloads: 1078
6 Selecting the Best Sub-Region Indexing the Images in the Case of Weak Segmentation Based On Local Color Histograms

Authors: Mawloud Mosbah, Bachir Boucheham

Abstract:

The color histogram is considered the oldest method used by CBIR systems for indexing images. Global histograms, however, do not include spatial information; this is why later techniques have attempted to overcome this limitation by involving segmentation as a preprocessing step. Weak segmentation is employed by local histograms, while other methods such as CCV (Color Coherent Vector) are based on strong segmentation. Indexation based on local histograms consists of splitting the image into N overlapping blocks or sub-regions and then computing the histogram of each block. The dissimilarity between two images is consequently reduced to computing the distances between the N local histograms of both images, resulting in N*N values; generally, the lowest value is taken into account to rank images, meaning that the lowest value designates which sub-region is used to index the images of the collection being queried. In this paper, we shed light on the local histogram indexation method in the hope of comparing the results obtained against those given by the global histogram. We also address another noteworthy issue when relying on local histograms, namely which value among the N*N values to trust when comparing images; in other words, which sub-region among the N*N sub-regions to base image indexing on. Based on the results achieved here, it seems that relying on local histograms, which imposes extra overhead on the system by involving another preprocessing step, namely segmentation, does not necessarily produce better results. In addition, we propose some ideas for selecting the local histogram used to encode the image, rather than relying on the local histogram having the lowest distance to the query histograms.

Keywords: CBIR, Color Global Histogram, Color Local Histogram, Weak Segmentation, Euclidean Distance.

Downloads 1684
5 Information Retrieval in Domain Specific Search Engine with Machine Learning Approaches

Authors: Shilpy Sharma

Abstract:

As the web continues to grow exponentially, the idea of crawling the entire web on a regular basis becomes less and less feasible; to cover the information on a specific domain, domain-specific search engines were proposed. As more information becomes available on the World Wide Web, it becomes more difficult to provide effective search tools for information access. Today, people access web information through two main kinds of search interfaces: Browsers (clicking and following hyperlinks) and Query Engines (queries in the form of a set of keywords showing the topic of interest) [2]. Web search tools need to offer better support for expressing one's information need and for returning high-quality search results. There appears to be a need for systems that reason under uncertainty and are flexible enough to recover from the contradictions, inconsistencies, and irregularities that such reasoning involves. In a multi-view problem, the features of the domain can be partitioned into disjoint subsets (views) that are sufficient to learn the target concept. Semi-supervised, multi-view algorithms, which reduce the amount of labeled data required for learning, rely on the assumptions that the views are compatible and uncorrelated. This paper describes the use of a semi-structured machine learning approach with active learning for "Domain Specific Search Engines". A domain-specific search engine is "an information access system that allows access to all the information on the web that is relevant to a particular domain". The proposed work shows that, with the help of this approach, relevant data can be extracted with a minimum number of queries fired by the user. It requires a small amount of labeled data and a pool of unlabeled data, on which the learning algorithm is applied to extract the required data.
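The idea of starting from a few labeled examples plus a pool of unlabeled data can be illustrated with a minimal pool-based active learning sketch in Python. The abstract does not specify the learner or the query strategy; the sketch assumes uncertainty sampling with a nearest-centroid classifier over one-dimensional numeric features, and all names (`predict_margin`, `active_learning`, `oracle`, `budget`) are hypothetical.

```python
def centroid(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def predict_margin(x, labeled):
    """Return (predicted_label, margin) for feature x.

    `labeled` is a list of (feature, label) pairs with labels 0 or 1 and
    must contain at least one example of each class. A small margin means
    the classifier is uncertain about x.
    """
    pos = centroid([v for v, y in labeled if y == 1])
    neg = centroid([v for v, y in labeled if y == 0])
    d_pos, d_neg = abs(x - pos), abs(x - neg)
    return (1 if d_pos < d_neg else 0), abs(d_pos - d_neg)


def active_learning(labeled, pool, oracle, budget):
    """Repeatedly ask the oracle (the user) to label the pool item with the
    smallest margin, for at most `budget` queries."""
    labeled, pool = list(labeled), list(pool)
    queries = []
    for _ in range(min(budget, len(pool))):
        x = min(pool, key=lambda v: predict_margin(v, labeled)[1])
        pool.remove(x)
        labeled.append((x, oracle(x)))
        queries.append(x)
    return labeled, queries
```

In this sketch the `budget` argument stands for the number of questions put to the user, so a smaller budget corresponds to fewer labels requested.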

Keywords: Search engines, machine learning, Information retrieval, Active logic.

Downloads 2041
4 Computational Method for Annotation of Protein Sequence According to Gene Ontology Terms

Authors: Razib M. Othman, Safaai Deris, Rosli M. Illias

Abstract:

Annotation of a protein sequence is pivotal for the understanding of its function. The accuracy of manual annotation provided by curators is still questionable because of its weaker evidence strength, and manual annotation remains a hard and time-consuming task. A number of computational methods, including tools, have been developed to tackle this challenging task. However, they require high-cost hardware, are difficult for bioscientists to set up, or depend on time-intensive and blind sequence similarity searches such as the Basic Local Alignment Search Tool. This paper introduces a new method of assigning highly correlated Gene Ontology terms of annotated protein sequences to partially annotated or newly discovered protein sequences. This method is fully based on Gene Ontology data and annotations. Two problems were identified in realizing this method. The first problem relates to splitting the single monolithic Gene Ontology RDF/XML file into a set of smaller files that are easy to assess and process. These files can then be enriched with protein sequences and Inferred from Electronic Annotation evidence associations. The second problem involves searching for a set of Gene Ontology terms that are semantically similar to a given query. The details of the macro and micro problems involved and their solutions, including the objective of this study, are described. This paper also describes protein sequence annotation and the Gene Ontology. The methodology of this study and the Gene Ontology based protein sequence annotation tool, namely extended UTMGO, are presented. Furthermore, its basic version, a Gene Ontology browser based on semantic similarity search, is also introduced.

Keywords: automatic clustering, bioinformatics tool, gene ontology, protein sequence annotation, semantic similarity search

Downloads 3091
3 CRYPTO COPYCAT: A Fashion Centric Blockchain Framework for Eliminating Fashion Infringement

Authors: Magdi Elmessiry, Adel Elmessiry

Abstract:

The fashion industry represents a significant portion of the global gross domestic product; however, it is plagued by cheap imitators that infringe on trademarks, which destroys the fashion industry's hard work and investment. While the copycats would eventually be found and stopped, the damage has already been done: sales are missed and direct and indirect jobs are lost. The infringer thrives on two main facts: the time it takes to discover them and the lack of tracking technologies that can help the consumer distinguish them. Blockchain is an emerging technology that provides a distributed, encrypted, immutable and fault-resistant ledger. Blockchain presents a ripe technology to resolve the infringement epidemic facing the fashion industry. The significance of the study is that a new approach leveraging state-of-the-art blockchain technology coupled with artificial intelligence is used to create a framework addressing the fashion infringement problem. It shifts the current focus from legal enforcement, which is difficult at best, to consumer awareness, which is far more effective. The framework, Crypto CopyCat, creates an immutable digital asset representing the actual product to empower the customer with a near-real-time query system. This combination emphasizes the consumer's awareness and appreciation of the product's authenticity, while providing real-time feedback to the producer regarding fake replicas. The main finding of this study is that implementing this approach can delay the penetration of fake products into the original product's market, thus allowing the original product time to take advantage of the market. The shift in fake adoption results in reduced returns, which impedes the copycat market and moves the emphasis to original product innovation.

Keywords: Fashion, infringement, Blockchain, artificial intelligence, textiles supply.

Downloads 1171
2 Evaluation of Pragmatic Information in an English Textbook: Focus on Requests

Authors: Israa A. Qari

Abstract:

Learning to request in a foreign language is a key ability within pragmatics language teaching. This paper examines how requests are taught in English Unlimited Book 3 (Cambridge University Press), an EFL textbook series employed by King Abdulaziz University in Jeddah, Saudi Arabia to teach advanced foundation year students English. The focus of analysis is the evaluation of the request linguistic strategies present in the textbook, the frequency of use of these strategies, and the contextual information provided on the use of these linguistic forms. The researcher collected all the linguistic forms which consisted of the request speech act and divided them into levels employing the CCSARP request coding manual. Findings demonstrated that simple and commonly employed request strategies are introduced. Looking closely at the exercises throughout the chapters, it was noticeable that the book exclusively employed the most direct form of requesting (the imperative) when giving learners instructions: e.g. listen, write, ask, answer, read, look, complete, choose, talk, think, etc. The book also made use of some other request strategies such as ‘hedged performatives’ and ‘query preparatory’. However, it was also found that many strategies were not dealt with in the book, specifically strategies with combined functions (e.g. possibility, ability). On a sociopragmatic level, a strong focus was found to exist on standard situations in which relations between the requester and requestee are clear. In general, contextual information was communicated only implicitly. The textbook did not seem to differentiate between formal and informal request contexts (register), which might consequently lead students to overgeneralize. The paper closes with some recommendations for textbook and curriculum designers. Findings are also contrasted with previous results from a similar body of research on EFL requests.

Keywords: EFL, Requests, Saudi, speech acts, textbook evaluation.

Downloads 391
1 A Temporal QoS Ontology for ERTMS/ETCS

Authors: Marc Sango, Olimpia Hoinaru, Christophe Gransart, Laurence Duchien

Abstract:

Ontologies offer a means for representing and sharing information in many domains, particularly in complex domains. For example, they can be used for representing and sharing the information in the System Requirement Specification (SRS) of complex systems, such as the SRS of ERTMS/ETCS, which is written in natural language. Since this system is a real-time and critical system, generic ontologies, such as OWL and generic ERTMS ontologies, provide minimal support for modeling the temporal information omnipresent in these SRS documents. To support the modeling of temporal information, one of the challenges is to enable the representation of dynamic features evolving in time within a generic ontology with minimal redesign of it. The separation of temporal information from other information can help to predict system runtime operation and to properly design and implement it. In addition, it is helpful to provide reasoning and querying techniques over the temporal information represented in the ontology in order to detect potential temporal inconsistencies. To address this challenge, we propose a lightweight 3-layer temporal Quality of Service (QoS) ontology for representing, reasoning and querying over temporal and non-temporal information in a complex domain ontology. Representing QoS entities in separate layers can clarify the distinction between the non-QoS entities and the QoS entities in an ontology. The upper generic layer of the proposed ontology provides an intuitive knowledge of domain components, especially ERTMS/ETCS components. The separation of the intermediate QoS layer from the lower QoS layer allows us to focus on specific QoS characteristics, such as temporal or integrity characteristics. In this paper, we focus on temporal information that can be used to predict system runtime operation.
To evaluate our approach, an example of the proposed domain ontology for handover operation, as well as a reasoning rule over temporal relations in this domain-specific ontology, are presented.

Keywords: System Requirement Specification, ERTMS/ETCS, Temporal Ontologies, Domain Ontologies.

Downloads 3089