Search results for: query performance
12975 Materialized View Effect on Query Performance
Authors: Yusuf Ziya Ayık, Ferhat Kahveci
Abstract:
Currently, database management systems have various tools such as backup and maintenance, and also provide statistical information such as resource usage and security. In terms of query performance, this paper covers query optimization, views, indexed tables, pre-computation materialized view, query performance analysis in which query plan alternatives can be created and the least costly one selected to optimize a query. Indexes and views can be created for related table columns. The literature review of this study showed that, in the course of time, despite the growing capabilities of the database management system, only database administrators are aware of the need for dealing with archival and transactional data types differently. These data may be constantly changing data used in everyday life, and also may be from the completed questionnaire whose data input was completed. For both types of data, the database uses its capabilities; but as shown in the findings section, instead of repeating similar heavy calculations which are carrying out same results with the same query over a survey results, using materialized view results can be in a more simple way. In this study, this performance difference was observed quantitatively considering the cost of the query.Keywords: cost of query, database management systems, materialized view, query performance
Procedia PDF Downloads 28012974 Searching k-Nearest Neighbors to be Appropriate under Gaming Environments
Authors: Jae Moon Lee
Abstract:
In general, algorithms to find continuous k-nearest neighbors have been researched on the location based services, monitoring periodically the moving objects such as vehicles and mobile phone. Those researches assume the environment that the number of query points is much less than that of moving objects and the query points are not moved but fixed. In gaming environments, this problem is when computing the next movement considering the neighbors such as flocking, crowd and robot simulations. In this case, every moving object becomes a query point so that the number of query point is same to that of moving objects and the query points are also moving. In this paper, we analyze the performance of the existing algorithms focused on location based services how they operate under gaming environments.Keywords: flocking behavior, heterogeneous agents, similarity, simulation
Procedia PDF Downloads 30212973 Optimization Query Image Using Search Relevance Re-Ranking Process
Authors: T. G. Asmitha Chandini
Abstract:
Web-based image search re-ranking, as an successful method to get better the results. In a query keyword, the first stair is store the images is first retrieve based on the text-based information. The user to select a query keywordimage, by using this query keyword other images are re-ranked based on their visual properties with images.Now a day to day, people projected to match images in a semantic space which is used attributes or reference classes closely related to the basis of semantic image. though, understanding a worldwide visual semantic space to demonstrate highly different images from the web is difficult and inefficient. The re-ranking images, which automatically offline part learns dissimilar semantic spaces for different query keywords. The features of images are projected into their related semantic spaces to get particular images. At the online stage, images are re-ranked by compare their semantic signatures obtained the semantic précised by the query keyword image. The query-specific semantic signatures extensively improve both the proper and efficiency of image re-ranking.Keywords: Query, keyword, image, re-ranking, semantic, signature
Procedia PDF Downloads 55012972 Smart Online Library Catalog System with Query Expansion for the University of the Cordilleras
Authors: Vincent Ballola, Raymund Dilan, Thelma Palaoag
Abstract:
The Smart Online Library Catalog System with Query Expansion seeks to address the low usage of the library because of the emergence of the Internet. Library users are not accustomed to catalog systems that need a query to have the exact words without any mistakes for decent results to appear. The graphical user interface of the current system has a rather skewed learning curve for users to adapt with. With a simple graphical user interface inspired by Google, users can search quickly just by inputting their query and hitting the search button. Because of the query expansion techniques incorporated into the new system such as stemming, thesaurus search, and weighted search, users can have more efficient results from their query. The system will be adding the root words of the user's query to the query itself which will then be cross-referenced to a thesaurus database to search for any synonyms that will be added to the query. The results will then be arranged by the number of times the word has been searched. Online queries will also be added to the results for additional references. Users showed notable increases in efficiency and usability due to the familiar interface and query expansion techniques incorporated in the system. The simple yet familiar design led to a better user experience. Users also said that they would be more inclined in using the library because of the new system. The incorporation of query expansion techniques gives a notable increase of results to users that in turn gives them a wider range of resources found in the library. Used books mean more knowledge imparted to the users.Keywords: query expansion, catalog system, stemming, weighted search, usability, thesaurus search
Procedia PDF Downloads 38812971 A Query Optimization Strategy for Autonomous Distributed Database Systems
Authors: Dina K. Badawy, Dina M. Ibrahim, Alsayed A. Sallam
Abstract:
Distributed database is a collection of logically related databases that cooperate in a transparent manner. Query processing uses a communication network for transmitting data between sites. It refers to one of the challenges in the database world. The development of sophisticated query optimization technology is the reason for the commercial success of database systems, which complexity and cost increase with increasing number of relations in the query. Mariposa, query trading and query trading with processing task-trading strategies developed for autonomous distributed database systems, but they cause high optimization cost because of involvement of all nodes in generating an optimal plan. In this paper, we proposed a modification on the autonomous strategy K-QTPT that make the seller’s nodes with the lowest cost have gradually high priorities to reduce the optimization time. We implement our proposed strategy and present the results and analysis based on those results.Keywords: autonomous strategies, distributed database systems, high priority, query optimization
Procedia PDF Downloads 52412970 Dynamic Store Procedures in Database
Authors: Muhammet Dursun Kaya, Hasan Asil
Abstract:
In recent years, different methods have been proposed to optimize question processing in database. Although different methods have been proposed to optimize the query, but the problem which exists here is that most of these methods destroy the query execution plan after executing the query. This research attempts to solve the above problem by using a combination of methods of communicating with the database (the present questions in the programming code and using store procedures) and making query processing adaptive in database, and proposing a new approach for optimization of query processing by introducing the idea of dynamic store procedures. This research creates dynamic store procedures in the database according to the proposed algorithm. This method has been tested on applied software and results shows a significant improvement in reducing the query processing time and also reducing the workload of DBMS. Other advantages of this algorithm include: making the programming environment a single environment, eliminating the parametric limitations of the stored procedures in the database, making the stored procedures in the database dynamic, etc.Keywords: relational database, agent, query processing, adaptable, communication with the database
Procedia PDF Downloads 37112969 Domain Adaptive Dense Retrieval with Query Generation
Authors: Rui Yin, Haojie Wang, Xun Li
Abstract:
Recently, mainstream dense retrieval methods have obtained state-of-the-art results on some datasets and tasks. However, they require large amounts of training data, which is not available in most domains. The severe performance degradation of dense retrievers on new data domains has limited the use of dense retrieval methods to only a few domains with large training datasets. In this paper, we propose an unsupervised domain-adaptive approach based on query generation. First, a generative model is used to generate relevant queries for each passage in the target corpus, and then, the generated queries are used for mining negative passages. Finally, the query-passage pairs are labeled with a cross-encoder and used to train a domain-adapted dense retriever. We also explore contrastive learning as a method for training domain-adapted dense retrievers and show that it leads to strong performance in various retrieval settings. Experiments show that our approach is more robust than previous methods in target domains that require less unlabeled data.Keywords: dense retrieval, query generation, contrastive learning, unsupervised training
Procedia PDF Downloads 10312968 Unsupervised Domain Adaptive Text Retrieval with Query Generation
Authors: Rui Yin, Haojie Wang, Xun Li
Abstract:
Recently, mainstream dense retrieval methods have obtained state-of-the-art results on some datasets and tasks. However, they require large amounts of training data, which is not available in most domains. The severe performance degradation of dense retrievers on new data domains has limited the use of dense retrieval methods to only a few domains with large training datasets. In this paper, we propose an unsupervised domain-adaptive approach based on query generation. First, a generative model is used to generate relevant queries for each passage in the target corpus, and then the generated queries are used for mining negative passages. Finally, the query-passage pairs are labeled with a cross-encoder and used to train a domain-adapted dense retriever. Experiments show that our approach is more robust than previous methods in target domains that require less unlabeled data.Keywords: dense retrieval, query generation, unsupervised training, text retrieval
Procedia PDF Downloads 7312967 Comparison of Crossover Types to Obtain Optimal Queries Using Adaptive Genetic Algorithm
Authors: Wafa’ Alma'Aitah, Khaled Almakadmeh
Abstract:
this study presents an information retrieval system of using genetic algorithm to increase information retrieval efficiency. Using vector space model, information retrieval is based on the similarity measurement between query and documents. Documents with high similarity to query are judge more relevant to the query and should be retrieved first. Using genetic algorithms, each query is represented by a chromosome; these chromosomes are fed into genetic operator process: selection, crossover, and mutation until an optimized query chromosome is obtained for document retrieval. Results show that information retrieval with adaptive crossover probability and single point type crossover and roulette wheel as selection type give the highest recall. The proposed approach is verified using (242) proceedings abstracts collected from the Saudi Arabian national conference.Keywords: genetic algorithm, information retrieval, optimal queries, crossover
Procedia PDF Downloads 29212966 Representativity Based Wasserstein Active Regression
Authors: Benjamin Bobbia, Matthias Picard
Abstract:
In recent years active learning methodologies based on the representativity of the data seems more promising to limit overfitting. The presented query methodology for regression using the Wasserstein distance measuring the representativity of our labelled dataset compared to the global distribution. In this work a crucial use of GroupSort Neural Networks is made therewith to draw a double advantage. The Wasserstein distance can be exactly expressed in terms of such neural networks. Moreover, one can provide explicit bounds for their size and depth together with rates of convergence. However, heterogeneity of the dataset is also considered by weighting the Wasserstein distance with the error of approximation at the previous step of active learning. Such an approach leads to a reduction of overfitting and high prediction performance after few steps of query. After having detailed the methodology and algorithm, an empirical study is presented in order to investigate the range of our hyperparameters. The performances of this method are compared, in terms of numbers of query needed, with other classical and recent query methods on several UCI datasets.Keywords: active learning, Lipschitz regularization, neural networks, optimal transport, regression
Procedia PDF Downloads 8012965 Book Recommendation Using Query Expansion and Information Retrieval Methods
Authors: Ritesh Kumar, Rajendra Pamula
Abstract:
In this paper, we present our contribution for book recommendation. In our experiment, we combine the results of Sequential Dependence Model (SDM) and exploitation of book information such as reviews, tags and ratings. This social information is assigned by users. For this, we used CLEF-2016 Social Book Search Track Suggestion task. Finally, our proposed method extensively evaluated on CLEF -2015 Social Book Search datasets, and has better performance (nDCG@10) compared to other state-of-the-art systems. Recently we got the good performance in CLEF-2016.Keywords: sequential dependence model, social information, social book search, query expansion
Procedia PDF Downloads 28912964 On Privacy-Preserving Search in the Encrypted Domain
Authors: Chun-Shien Lu
Abstract:
Privacy-preserving query has recently received considerable attention in the signal processing and multimedia community. It is also a critical step in wireless sensor network for retrieval of sensitive data. The purposes of privacy-preserving query in both the areas of signal processing and sensor network are the same, but the similarity and difference of the adopted technologies are not fully explored. In this paper, we first review the recently developed methods of privacy-preserving query, and then describe in a comprehensive manner what we can learn from the mutual of both areas.Keywords: encryption, privacy-preserving, search, security
Procedia PDF Downloads 25612963 User Guidance for Effective Query Interpretation in Natural Language Interfaces to Ontologies
Authors: Aliyu Isah Agaie, Masrah Azrifah Azmi Murad, Nurfadhlina Mohd Sharef, Aida Mustapha
Abstract:
Natural Language Interfaces typically support a restricted language and also have scopes and limitations that naïve users are unaware of, resulting in errors when the users attempt to retrieve information from ontologies. To overcome this challenge, an auto-suggest feature is introduced into the querying process where users are guided through the querying process using interactive query construction system. Guiding users to formulate their queries, while providing them with an unconstrained (or almost unconstrained) way to query the ontology results in better interpretation of the query and ultimately lead to an effective search. The approach described in this paper is unobtrusive and subtly guides the users, so that they have a choice of either selecting from the suggestion list or typing in full. The user is not coerced into accepting system suggestions and can express himself using fragments or full sentences.Keywords: auto-suggest, expressiveness, habitability, natural language interface, query interpretation, user guidance
Procedia PDF Downloads 47412962 Multiple Query Optimization in Wireless Sensor Networks Using Data Correlation
Authors: Elaheh Vaezpour
Abstract:
Data sensing in wireless sensor networks is done by query deceleration the network by the users. In many applications of the wireless sensor networks, many users send queries to the network simultaneously. If the queries are processed separately, the network’s energy consumption will increase significantly. Therefore, it is very important to aggregate the queries before sending them to the network. In this paper, we propose a multiple query optimization framework based on sensors physical and temporal correlation. In the proposed method, queries are merged and sent to network by considering correlation among the sensors in order to reduce the communication cost between the sensors and the base station.Keywords: wireless sensor networks, multiple query optimization, data correlation, reducing energy consumption
Procedia PDF Downloads 33412961 Query Task Modulator: A Computerized Experimentation System to Study Media-Multitasking Behavior
Authors: Premjit K. Sanjram, Gagan Jakhotiya, Apoorv Goyal, Shanu Shukla
Abstract:
In psychological research, laboratory experiments often face the trade-off issue between experimental control and mundane realism. With the advent of Immersive Virtual Environment Technology (IVET), this issue seems to be at bay. However there is a growing challenge within the IVET itself to design and develop system or software that captures the psychological phenomenon of everyday lives. One such phenomena that is of growing interest is ‘media-multitasking’ To aid laboratory researches in media-multitasking this paper introduces Query Task Modulator (QTM), a computerized experimentation system to study media-multitasking behavior in a controlled laboratory environment. The system provides a computerized platform in conducting an experiment for experimenters to study media-multitasking in which participants will be involved in a query task. The system has Instant Messaging, E-mail, and Voice Call features. The answers to queries are provided on the left hand side information panel where participants have to search for it and feed the information in the respective communication media blocks as fast as possible. On the whole the system will collect multitasking behavioral data. To analyze performance there is a separate output table that records the reaction times and responses of the participants individually. Information panel and all the media blocks will appear on a single window in order to ensure multi-modality feature in media-multitasking and equal emphasis on all the tasks (thus avoiding prioritization to a particular task). The paper discusses the development of QTM in the light of current techniques of studying media-multitasking.Keywords: experimentation system, human performance, media-multitasking, query-task
Procedia PDF Downloads 55712960 Semantic Search Engine Based on Query Expansion with Google Ranking and Similarity Measures
Authors: Ahmad Shahin, Fadi Chakik, Walid Moudani
Abstract:
Our study is about elaborating a potential solution for a search engine that involves semantic technology to retrieve information and display it significantly. Semantic search engines are not used widely over the web as the majorities are still in Beta stage or under construction. Many problems face the current applications in semantic search, the major problem is to analyze and calculate the meaning of query in order to retrieve relevant information. Another problem is the ontology based index and its updates. Ranking results according to concept meaning and its relation with query is another challenge. In this paper, we are offering a light meta-engine (QESM) which uses Google search, and therefore Google’s index, with some adaptations to its returned results by adding multi-query expansion. The mission was to find a reliable ranking algorithm that involves semantics and uses concepts and meanings to rank results. At the beginning, the engine finds synonyms of each query term entered by the user based on a lexical database. Then, query expansion is applied to generate different semantically analogous sentences. These are generated randomly by combining the found synonyms and the original query terms. Our model suggests the use of semantic similarity measures between two sentences. Practically, we used this method to calculate semantic similarity between each query and the description of each page’s content generated by Google. The generated sentences are sent to Google engine one by one, and ranked again all together with the adapted ranking method (QESM). Finally, our system will place Google pages with higher similarities on the top of the results. We have conducted experimentations with 6 different queries. We have observed that most ranked results with QESM were altered with Google’s original generated pages. With our experimented queries, QESM generates frequently better accuracy than Google. In some worst cases, it behaves like Google.Keywords: semantic search engine, Google indexing, query expansion, similarity measures
Procedia PDF Downloads 42512959 Distributed Real-Time Range Query Approximation in a Streaming Environment
Authors: Simon Keller, Rainer Mueller
Abstract:
Continuous range queries are a common means to handle mobile clients in high-density areas. Most existing approaches focus on settings in which the range queries for location-based services are more or less static, whereas the mobile clients in the ranges move. We focus on a category called dynamic real-time range queries (DRRQ), assuming that both, clients requested by the query and the inquirers, are mobile. In consequence, the query parameters and the query results continuously change. This leads to two requirements: the ability to deal with an arbitrarily high number of mobile nodes (scalability) and the real-time delivery of range query results. In this paper, we present the highly decentralized solution adaptive quad streaming (AQS) for the requirements of DRRQs. AQS approximates the query results in favor of a controlled real-time delivery and guaranteed scalability. While prior works commonly optimize data structures on the involved servers, we use AQS to focus on a highly distributed cell structure without data structures automatically adapting to changing client distributions. Instead of the commonly used request-response approach, we apply a lightweight streaming method in which no bidirectional communication and no storage or maintenance of queries are required at all.Keywords: approximation of client distributions, continuous spatial range queries, mobile objects, streaming-based decentralization in spatial mobile environments
Procedia PDF Downloads 14612958 Performance Evaluation of Hierarchical Location-Based Services Coupled to the Greedy Perimeter Stateless Routing Protocol for Wireless Sensor Networks
Authors: Rania Khadim, Mohammed Erritali, Abdelhakim Maaden
Abstract:
Nowadays Wireless Sensor Networks have attracted worldwide research and industrial interest, because they can be applied in various areas. Geographic routing protocols are very suitable to those networks because they use location information when they need to route packets. Obviously, location information is maintained by Location-Based Services provided by network nodes in a distributed way. In this paper we choose to evaluate the performance of two hierarchical rendezvous location based-services, GLS (Grid Location Service) and HLS (Hierarchical Location Service) coupled to the GPSR routing protocol (Greedy Perimeter Stateless Routing) for Wireless Sensor Network. The simulations were performed using NS2 simulator to evaluate the performance and power of the two services in term of location overhead, the request travel time (RTT) and the query Success ratio (QSR). This work presents also a new scalability performance study of both GLS and HLS, specifically, what happens if the number of nodes N increases. The study will focus on three qualitative metrics: The location maintenance cost, the location query cost and the storage cost.Keywords: location based-services, routing protocols, scalability, wireless sensor networks
Procedia PDF Downloads 37212957 Towards Learning Query Expansion
Authors: Ahlem Bouziri, Chiraz Latiri, Eric Gaussier
Abstract:
The steady growth in the size of textual document collections is a key progress-driver for modern information retrieval techniques whose effectiveness and efficiency are constantly challenged. Given a user query, the number of retrieved documents can be overwhelmingly large, hampering their efficient exploitation by the user. In addition, retaining only relevant documents in a query answer is of paramount importance for an effective meeting of the user needs. In this situation, the query expansion technique offers an interesting solution for obtaining a complete answer while preserving the quality of retained documents. This mainly relies on an accurate choice of the added terms to an initial query. Interestingly enough, query expansion takes advantage of large text volumes by extracting statistical information about index terms co-occurrences and using it to make user queries better fit the real information needs. In this respect, a promising track consists in the application of data mining methods to extract dependencies between terms, namely a generic basis of association rules between terms. The key feature of our approach is a better trade off between the size of the mining result and the conveyed knowledge. Thus, face to the huge number of derived association rules and in order to select the optimal combination of query terms from the generic basis, we propose to model the problem as a classification problem and solve it using a supervised learning algorithm such as SVM or k-means. For this purpose, we first generate a training set using a genetic algorithm based approach that explores the association rules space in order to find an optimal set of expansion terms, improving the MAP of the search results. The experiments were performed on SDA 95 collection, a data collection for information retrieval. It was found that the results were better in both terms of MAP and NDCG. The main observation is that the hybridization of text mining techniques and query expansion in an intelligent way allows us to incorporate the good features of all of them. As this is a preliminary attempt in this direction, there is a large scope for enhancing the proposed method.Keywords: supervised leaning, classification, query expansion, association rules
Procedia PDF Downloads 32412956 Real-Time Data Stream Partitioning over a Sliding Window in Real-Time Spatial Big Data
Authors: Sana Hamdi, Emna Bouazizi, Sami Faiz
Abstract:
In recent years, real-time spatial applications, like location-aware services and traffic monitoring, have become more and more important. Such applications result dynamic environments where data as well as queries are continuously moving. As a result, there is a tremendous amount of real-time spatial data generated every day. The growth of the data volume seems to outspeed the advance of our computing infrastructure. For instance, in real-time spatial Big Data, users expect to receive the results of each query within a short time period without holding in account the load of the system. But with a huge amount of real-time spatial data generated, the system performance degrades rapidly especially in overload situations. To solve this problem, we propose the use of data partitioning as an optimization technique. Traditional horizontal and vertical partitioning can increase the performance of the system and simplify data management. But they remain insufficient for real-time spatial Big data; they can’t deal with real-time and stream queries efficiently. Thus, in this paper, we propose a novel data partitioning approach for real-time spatial Big data named VPA-RTSBD (Vertical Partitioning Approach for Real-Time Spatial Big data). This contribution is an implementation of the Matching algorithm for traditional vertical partitioning. We find, firstly, the optimal attribute sequence by the use of Matching algorithm. Then, we propose a new cost model used for database partitioning, for keeping the data amount of each partition more balanced limit and for providing a parallel execution guarantees for the most frequent queries. VPA-RTSBD aims to obtain a real-time partitioning scheme and deals with stream data. It improves the performance of query execution by maximizing the degree of parallel execution. This affects QoS (Quality Of Service) improvement in real-time spatial Big Data especially with a huge volume of stream data. The performance of our contribution is evaluated via simulation experiments. The results show that the proposed algorithm is both efficient and scalable, and that it outperforms comparable algorithms.Keywords: real-time spatial big data, quality of service, vertical partitioning, horizontal partitioning, matching algorithm, hamming distance, stream query
Procedia PDF Downloads 15712955 Regression Approach for Optimal Purchase of Hosts Cluster in Fixed Fund for Hadoop Big Data Platform
Authors: Haitao Yang, Jianming Lv, Fei Xu, Xintong Wang, Yilin Huang, Lanting Xia, Xuewu Zhu
Abstract:
Given a fixed fund, purchasing fewer hosts of higher capability or inversely more of lower capability is a must-be-made trade-off in practices for building a Hadoop big data platform. An exploratory study is presented for a Housing Big Data Platform project (HBDP), where typical big data computing is with SQL queries of aggregate, join, and space-time condition selections executed upon massive data from more than 10 million housing units. In HBDP, an empirical formula was introduced to predict the performance of host clusters potential for the intended typical big data computing, and it was shaped via a regression approach. With this empirical formula, it is easy to suggest an optimal cluster configuration. The investigation was based on a typical Hadoop computing ecosystem HDFS+Hive+Spark. A proper metric was raised to measure the performance of Hadoop clusters in HBDP, which was tested and compared with its predicted counterpart, on executing three kinds of typical SQL query tasks. Tests were conducted with respect to factors of CPU benchmark, memory size, virtual host division, and the number of element physical host in cluster. The research has been applied to practical cluster procurement for housing big data computing.Keywords: Hadoop platform planning, optimal cluster scheme at fixed-fund, performance predicting formula, typical SQL query tasks
Procedia PDF Downloads 23212954 Evaluation of NoSQL in the Energy Marketplace with GraphQL Optimization
Authors: Michael Howard
Abstract:
The growing popularity of electric vehicles in the United States requires an ever-expanding infrastructure of commercial DC fast charging stations. The U.S. Department of Energy estimates 33,355 publicly available DC fast charging stations as of September 2023. In 2017, 115,370 gasoline stations were operating in the United States, much more ubiquitous than DC fast chargers. Range anxiety is an important impediment to the adoption of electric vehicles and is even more relevant in underserved regions in the country. The peer-to-peer energy marketplace helps fill the demand by allowing private home and small business owners to rent their 240 Volt, level-2 charging facilities. The existing, publicly accessible outlets are wrapped with a Cloud-connected microcontroller managing security and charging sessions. These microcontrollers act as Edge devices communicating with a Cloud message broker, while both buyer and seller users interact with the framework via a web-based user interface. The database storage used by the marketplace framework is a key component in both the cost of development and the performance that contributes to the user experience. A traditional storage solution is the SQL database. The architecture and query language have been in existence since the 1970s and are well understood and documented. The Structured Query Language supported by the query engine provides fine granularity with user query conditions. However, difficulty in scaling across multiple nodes and cost of its server-based compute have resulted in a trend in the last 20 years towards other NoSQL, serverless approaches. In this study, we evaluate the NoSQL vs. SQL solutions through a comparison of Google Cloud Firestore and Cloud SQL MySQL offerings. The comparison pits Google's serverless, document-model, non-relational, NoSQL against the server-base, table-model, relational, SQL service. The evaluation is based on query latency, flexibility/scalability, and cost criteria. Through benchmarking and analysis of the architecture, we determine whether Firestore can support the energy marketplace storage needs and if the introduction of a GraphQL middleware layer can overcome its deficiencies.Keywords: non-relational, relational, MySQL, mitigate, Firestore, SQL, NoSQL, serverless, database, GraphQL
Procedia PDF Downloads 6212953 2D Fingerprint Performance for PubChem Chemical Database
Authors: Fatimah Zawani Abdullah, Shereena Mohd Arif, Nurul Malim
Abstract:
The study of molecular similarity search in chemical database is increasingly widespread, especially in the area of drug discovery. Similarity search is an application in the field of Chemoinformatics to measure the similarity between the molecular structure which is known as the query and the structure of chemical compounds in the database. Similarity search is also one of the approaches in virtual screening which involves computational techniques and scoring the probabilities of activity. The main objective of this work is to determine the best fingerprint when compared to the other five fingerprints selected in this study using PubChem chemical dataset. This paper will discuss the similarity searching process conducted using 6 types of descriptors, which are ECFP4, ECFC4, FCFP4, FCFC4, SRECFC4 and SRFCFC4 on 15 activity classes of PubChem dataset using Tanimoto coefficient to calculate the similarity between the query structures and each of the database structure. The results suggest that ECFP4 performs the best to be used with Tanimoto coefficient in the PubChem dataset.Keywords: 2D fingerprints, Tanimoto, PubChem, similarity searching, chemoinformatics
Procedia PDF Downloads 29312952 Improved Image Retrieval for Efficient Localization in Urban Areas Using Location Uncertainty Data
Authors: Mahdi Salarian, Xi Xu, Rashid Ansari
Abstract:
Accurate localization of mobile devices based on camera-acquired visual media information usually requires a search over a very large GPS-referenced image database. This paper proposes an efficient method for limiting the search space for image retrieval engine by extracting and leveraging additional media information about Estimated Positional Error (EP E) to address complexity and accuracy issues in the search, especially to be used for compensating GPS location inaccuracy in dense urban areas. The improved performance is achieved by up to a hundred-fold reduction in the search area used in available reference methods while providing improved accuracy. To test our procedure we created a database by acquiring Google Street View (GSV) images for down town of Chicago. Other available databases are not suitable for our approach due to lack of EP E for the query images. We tested the procedure using more than 200 query images along with EP E acquired mostly in the densest areas of Chicago with different phones and in different conditions such as low illumination and from under rail tracks. The effectiveness of our approach and the effect of size and sector angle of the search area are discussed and experimental results demonstrate how our proposed method can improve performance just by utilizing a data that is available for mobile systems such as smart phones.Keywords: localization, retrieval, GPS uncertainty, bag of word
Procedia PDF Downloads 28312951 Programming Language Extension Using Structured Query Language for Database Access
Authors: Chapman Eze Nnadozie
Abstract:
Relational databases constitute a very vital tool for the effective management and administration of both personal and organizational data. Data access ranges from a single user database management software to a more complex distributed server system. This paper intends to appraise the use a programming language extension like structured query language (SQL) to establish links to a relational database (Microsoft Access 2013) using Visual C++ 9 programming language environment. The methodology used involves the creation of tables to form a database using Microsoft Access 2013, which is Object Linking and Embedding (OLE) database compliant. The SQL command is used to query the tables in the database for easy extraction of expected records inside the visual C++ environment. The findings of this paper reveal that records can easily be accessed and manipulated to filter exactly what the user wants, such as retrieval of records with specified criteria, updating of records, and deletion of part or the whole records in a table.Keywords: data access, database, database management system, OLE, programming language, records, relational database, software, SQL, table
Procedia PDF Downloads 18712950 Performance-Based Quality Evaluation of Database Conceptual Schemas
Authors: Janusz Getta, Zhaoxi Pan
Abstract:
Performance-based quality evaluation of database conceptual schemas is an important aspect of database design process. It is evident that different conceptual schemas provide different logical schemas and performance of user applications strongly depends on logical and physical database structures. This work presents the entire process of performance-based quality evaluation of conceptual schemas. First, we show format. Then, the paper proposes a new specification of object algebra for representation of conceptual level database applications. Transformation of conceptual schemas and expression of object algebra into implementation schema and implementation in a particular database system allows for precise estimation of the processing costs of database applications and as a consequence for precise evaluation of performance-based quality of conceptual schemas. Then we describe an experiment as a proof of concept for the evaluation procedure presented in the paper.Keywords: conceptual schema, implementation schema, logical schema, object algebra, performance evaluation, query processing
Procedia PDF Downloads 29212949 Graph-Oriented Summary for Optimized Resource Description Framework Graphs Streams Processing
Authors: Amadou Fall Dia, Maurras Ulbricht Togbe, Aliou Boly, Zakia Kazi Aoul, Elisabeth Metais
Abstract:
Existing RDF (Resource Description Framework) Stream Processing (RSP) systems allow continuous processing of RDF data issued from different application domains such as weather station measuring phenomena, geolocation, IoT applications, drinking water distribution management, and so on. However, processing window phase often expires before finishing the entire session and RSP systems immediately delete data streams after each processed window. Such mechanism does not allow optimized exploitation of the RDF data streams as the most relevant and pertinent information of the data is often not used in a due time and almost impossible to be exploited for further analyzes. It should be better to keep the most informative part of data within streams while minimizing the memory storage space. In this work, we propose an RDF graph summarization system based on an explicit and implicit expressed needs through three main approaches: (1) an approach for user queries (SPARQL) in order to extract their needs and group them into a more global query, (2) an extension of the closeness centrality measure issued from Social Network Analysis (SNA) to determine the most informative parts of the graph and (3) an RDF graph summarization technique combining extracted user query needs and the extended centrality measure. Experiments and evaluations show efficient results in terms of memory space storage and the most expected approximate query results on summarized graphs compared to the source ones.Keywords: centrality measures, RDF graphs summary, RDF graphs stream, SPARQL query
Procedia PDF Downloads 20312948 Resource Creation Using Natural Language Processing Techniques for Malay Translated Qur'an
Authors: Nor Diana Ahmad, Eric Atwell, Brandon Bennett
Abstract:
Text processing techniques for English have been developed for several decades. But for the Malay language, text processing methods are still far behind. Moreover, there are limited resources, tools for computational linguistic analysis available for the Malay language. Therefore, this research presents the use of natural language processing (NLP) in processing Malay translated Qur’an text. As the result, a new language resource for Malay translated Qur’an was created. This resource will help other researchers to build the necessary processing tools for the Malay language. This research also develops a simple question-answer prototype to demonstrate the use of the Malay Qur’an resource for text processing. This prototype has been developed using Python. The prototype pre-processes the Malay Qur’an and an input query using a stemming algorithm and then searches for occurrences of the query word stem. The result produced shows improved matching likelihood between user query and its answer. A POS-tagging algorithm has also been produced. The stemming and tagging algorithms can be used as tools for research related to other Malay texts and can be used to support applications such as information retrieval, question answering systems, ontology-based search and other text analysis tasks.Keywords: language resource, Malay translated Qur'an, natural language processing (NLP), text processing
Procedia PDF Downloads 31812947 Perusing the Influence of a Visual Editor in Enabling PostgreSQL Query Learn-Ability
Authors: Manuela Nayantara Jeyaraj
Abstract:
PostgreSQL is an Object-Relational Database Management System (ORDBMS) with an architecture that ensures optimal quality data management. But due to the shading growth of similar ORDBMS, PostgreSQL has not been renowned among the database user community. Despite having its features and in-built functionalities shadowed, PostgreSQL renders a vast range of utilities for data manipulation and hence calling for it to be upheld more among users. But introducing PostgreSQL in order to stimulate its advantageous features among users, mandates endorsing learn-ability as an add-on as the target groups considered consist of both amateur as well as professional PostgreSQL users. The scope of this paper deliberates providing easy contemplation of query formulations and flows through a visual editor designed according to user interface principles that standby to support every aspect of making PostgreSQL learn-able by self-operation and creation of queries within the visual editor. This paper tends to scrutinize the importance of choosing PostgreSQL as the working database environment, the visual perspectives that influence human behaviour and ultimately learning, the modes in which learn-ability can be provided via visualization and the advantages reaped by the implementation of the proposed system features.Keywords: database, learn-ability, PostgreSQL, query, visual-editor
Procedia PDF Downloads 17412946 A Framework for SQL Learning: Linking Learning Taxonomy, Cognitive Model and Cross Cutting Factors
Authors: Huda Al Shuaily, Karen Renaud
Abstract:
Databases comprise the foundation of most software systems. System developers inevitably write code to query these databases. The de facto language for querying is SQL and this, consequently, is the default language taught by higher education institutions. There is evidence that learners find it hard to master SQL, harder than mastering other programming languages such as Java. Educators do not agree about explanations for this seeming anomaly. Further investigation may well reveal the reasons. In this paper, we report on our investigations into how novices learn SQL, the actual problems they experience when writing SQL, as well as the differences between expert and novice SQL query writers. We conclude by presenting a model of SQL learning that should inform the instructional material design process better to support the SQL learning process.Keywords: pattern, SQL, learning, model
Procedia PDF Downloads 254