Search results for: data contents search
8326 Unstructured-Data Content Search Based on Optimized EEG Signal Processing and Multi-Objective Feature Extraction
Authors: Qais M. Yousef, Yasmeen A. Alshaer
Abstract:
Over the last few years, the amount of data available on the globe has been increased rapidly. This came up with the emergence of recent concepts, such as the big data and the Internet of Things, which have furnished a suitable solution for the availability of data all over the world. However, managing this massive amount of data remains a challenge due to their large verity of types and distribution. Therefore, locating the required file particularly from the first trial turned to be a not easy task, due to the large similarities of names for different files distributed on the web. Consequently, the accuracy and speed of search have been negatively affected. This work presents a method using Electroencephalography signals to locate the files based on their contents. Giving the concept of natural mind waves processing, this work analyses the mind wave signals of different people, analyzing them and extracting their most appropriate features using multi-objective metaheuristic algorithm, and then classifying them using artificial neural network to distinguish among files with similar names. The aim of this work is to provide the ability to find the files based on their contents using human thoughts only. Implementing this approach and testing it on real people proved its ability to find the desired files accurately within noticeably shorter time and retrieve them as a first choice for the user.
Keywords: Artificial intelligence, data contents search, human active memory, mind wave, multi-objective optimization.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 9208325 Data Extraction of XML Files using Searching and Indexing Techniques
Authors: Sushma Satpute, Vaishali Katkar, Nilesh Sahare
Abstract:
XML files contain data which is in well formatted manner. By studying the format or semantics of the grammar it will be helpful for fast retrieval of the data. There are many algorithms which describes about searching the data from XML files. There are no. of approaches which uses data structure or are related to the contents of the document. In these cases user must know about the structure of the document and information retrieval techniques using NLPs is related to content of the document. Hence the result may be irrelevant or not so successful and may take more time to search.. This paper presents fast XML retrieval techniques by using new indexing technique and the concept of RXML. When indexing an XML document, the system takes into account both the document content and the document structure and assigns the value to each tag from file. To query the system, a user is not constrained about fixed format of query.
Keywords: XML Retrieval, Indexed Search, Information Retrieval.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17828324 Matching Current Search with Future Postings
Authors: Kim Nee Goh, Viknesh Kumar Naleyah
Abstract:
Online trading is an alternative to conventional shopping method. People trade goods which are new or pre-owned before. However, there are times when a user is not able to search the items wanted online. This is because the items may not be posted as yet, thus ending the search. Conventional search mechanism only works by searching and matching search criteria (requirement) with data available in a particular database. This research aims to match current search requirements with future postings. This would involve the time factor in the conventional search method. A Car Matching Alert System (CMAS) prototype was developed to test the matching algorithm. When a buyer-s search returns no result, the system saves the search and the buyer will be alerted if there is a match found based on future postings. The algorithm developed is useful and as it can be applied in other search context.
Keywords: Matching algorithm, online trading, search, future postings, car matching
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14248323 A Novel Approach to Improve Users Search Goal in Web Usage Mining
Authors: R. Lokeshkumar, P. Sengottuvelan
Abstract:
Web mining is to discover and extract useful Information. Different users may have different search goals when they search by giving queries and submitting it to a search engine. The inference and analysis of user search goals can be very useful for providing an experience result for a user search query. In this project, we propose a novel approach to infer user search goals by analyzing search web logs. First, we propose a novel approach to infer user search goals by analyzing search engine query logs, the feedback sessions are constructed from user click-through logs and it efficiently reflect the information needed for users. Second we propose a preprocessing technique to clean the unnecessary data’s from web log file (feedback session). Third we propose a technique to generate pseudo-documents to representation of feedback sessions for clustering. Finally we implement k-medoids clustering algorithm to discover different user search goals and to provide a more optimal result for a search query based on feedback sessions for the user.Keywords: Data Preprocessing, Session Identification, Web log mining, Web Personalization.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20228322 Object Identification with Color, Texture, and Object-Correlation in CBIR System
Authors: Awais Adnan, Muhammad Nawaz, Sajid Anwar, Tamleek Ali, Muhammad Ali
Abstract:
Needs of an efficient information retrieval in recent years in increased more then ever because of the frequent use of digital information in our life. We see a lot of work in the area of textual information but in multimedia information, we cannot find much progress. In text based information, new technology of data mining and data marts are now in working that were started from the basic concept of database some where in 1960. In image search and especially in image identification, computerized system at very initial stages. Even in the area of image search we cannot see much progress as in the case of text based search techniques. One main reason for this is the wide spread roots of image search where many area like artificial intelligence, statistics, image processing, pattern recognition play their role. Even human psychology and perception and cultural diversity also have their share for the design of a good and efficient image recognition and retrieval system. A new object based search technique is presented in this paper where object in the image are identified on the basis of their geometrical shapes and other features like color and texture where object-co-relation augments this search process. To be more focused on objects identification, simple images are selected for the work to reduce the role of segmentation in overall process however same technique can also be applied for other images.Keywords: Object correlation, Geometrical shape, Color, texture, features, contents.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20278321 A Study on the Mobile Web Generating using Element of User Experience
Authors: Heeae Ko, Jongkeun Kim, Kunjung Sim, Kunho Sim, Yonghwan Lim
Abstract:
As mobile service's subscriber is increasing; mobile contents services are getting more and more variables. So, mobile contents development needs not only contents design but also guideline for just mobile. And when mobile contents are developed, it is important to pass the limit and restriction of the mobile. The restrictions of mobile are small browser and screen size, limited download size and uncomfortable navigation. So each contents of mobile guideline will be presented for user's usability, easy of development and consistency of rule. This paper will be proposed methodology which is each contents of mobile guideline. Mobile web will be developed by mobile guideline which I proposed.Keywords: Guideline, interface, mobile, mobile computing, userexperience.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16638320 High Speed Bitwise Search for Digital Forensic System
Authors: Hyungkeun Jee, Jooyoung Lee, Dowon Hong
Abstract:
The most common forensic activity is searching a hard disk for string of data. Nowadays, investigators and analysts are increasingly experiencing large, even terabyte sized data sets when conducting digital investigations. Therefore consecutive searching can take weeks to complete successfully. There are two primary search methods: index-based search and bitwise search. Index-based searching is very fast after the initial indexing but initial indexing takes a long time. In this paper, we discuss a high speed bitwise search model for large-scale digital forensic investigations. We used pattern matching board, which is generally used for network security, to search for string and complex regular expressions. Our results indicate that in many cases, the use of pattern matching board can substantially increase the performance of digital forensic search tools.Keywords: Digital forensics, search, regular expression.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18058319 Categorizing Search Result Records Using Word Sense Disambiguation
Authors: R. Babisaraswathi, N. Shanthi, S. S. Kiruthika
Abstract:
Web search engines are designed to retrieve and extract the information in the web databases and to return dynamic web pages. The Semantic Web is an extension of the current web in which it includes semantic content in web pages. The main goal of semantic web is to promote the quality of the current web by changing its contents into machine understandable form. Therefore, the milestone of semantic web is to have semantic level information in the web. Nowadays, people use different keyword- based search engines to find the relevant information they need from the web. But many of the words are polysemous. When these words are used to query a search engine, it displays the Search Result Records (SRRs) with different meanings. The SRRs with similar meanings are grouped together based on Word Sense Disambiguation (WSD). In addition to that semantic annotation is also performed to improve the efficiency of search result records. Semantic Annotation is the process of adding the semantic metadata to web resources. Thus the grouped SRRs are annotated and generate a summary which describes the information in SRRs. But the automatic semantic annotation is a significant challenge in the semantic web. Here ontology and knowledge based representation are used to annotate the web pages.
Keywords: Ontology, Semantic Web, WordNet, Word Sense Disambiguation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17628318 EnArgus: A Knowledge-Based Search Application for Energy Research Projects
Authors: Frederike Ohrem, Lukas Sikorski, Bastian Haarmann
Abstract:
Often the users of a semantic search application are facing the problem that they do not find appropriate terms for their search. This holds especially if the data to be searched is from a technical field in which the user does not have expertise. In order to support the user finding the results he seeks, we developed a domain-specific ontology and implemented it into a search application. The ontology serves as a knowledge base, suggesting technical terms to the user which he can add to his query. In this paper, we present the search application and the underlying ontology as well as the project EnArgus in which the application was developed.
Keywords: Information system, knowledge representation, ontology, semantic search.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17298317 A Context-Sensitive Algorithm for Media Similarity Search
Authors: Guang-Ho Cha
Abstract:
This paper presents a context-sensitive media similarity search algorithm. One of the central problems regarding media search is the semantic gap between the low-level features computed automatically from media data and the human interpretation of them. This is because the notion of similarity is usually based on high-level abstraction but the low-level features do not sometimes reflect the human perception. Many media search algorithms have used the Minkowski metric to measure similarity between image pairs. However those functions cannot adequately capture the aspects of the characteristics of the human visual system as well as the nonlinear relationships in contextual information given by images in a collection. Our search algorithm tackles this problem by employing a similarity measure and a ranking strategy that reflect the nonlinearity of human perception and contextual information in a dataset. Similarity search in an image database based on this contextual information shows encouraging experimental results.
Keywords: Context-sensitive search, image search, media search, similarity ranking, similarity search.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6398316 Quick Sequential Search Algorithm Used to Decode High-Frequency Matrices
Authors: Mohammed M. Siddeq, Mohammed H. Rasheed, Omar M. Salih, Marcos A. Rodrigues
Abstract:
This research proposes a data encoding and decoding method based on the Matrix Minimization algorithm. This algorithm is applied to high-frequency coefficients for compression/encoding. The algorithm starts by converting every three coefficients to a single value; this is accomplished based on three different keys. The decoding/decompression uses a search method called QSS (Quick Sequential Search) Decoding Algorithm presented in this research based on the sequential search to recover the exact coefficients. In the next step, the decoded data are saved in an auxiliary array. The basic idea behind the auxiliary array is to save all possible decoded coefficients; this is because another algorithm, such as conventional sequential search, could retrieve encoded/compressed data independently from the proposed algorithm. The experimental results showed that our proposed decoding algorithm retrieves original data faster than conventional sequential search algorithms.
Keywords: Matrix Minimization Algorithm, Decoding Sequential Search Algorithm, image compression, Discrete Cosine Transform, Discrete Wavelet Transform.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2468315 Image-Based (RBG) Technique for Estimating Phosphorus Levels of Crops
Authors: M. M. Ali, Ahmed Al-Ani, Derek Eamus, Daniel K. Y. Tan
Abstract:
In this glasshouse study, we developed a new imagebased non-destructive technique for detecting leaf P status of different crops such as cotton, tomato and lettuce. The plants were grown on a nutrient solution containing different P concentrations, e.g. 0%, 50% and 100% of recommended P concentration (P0 = no P, L; P1 = 2.5 mL 10 L-1 of P and P2 = 5 mL 10 L-1 of P). After 7 weeks of treatment, the plants were harvested and data on leaf P contents were collected using the standard destructive laboratory method and at the same time leaf images were collected by a handheld crop image sensor. We calculated leaf area, leaf perimeter and RGB (red, green and blue) values of these images. These data were further used in linear discriminant analysis (LDA) to estimate leaf P contents, which successfully classified these plants on the basis of leaf P contents. The data indicated that P deficiency in crop plants can be predicted using leaf image and morphological data. Our proposed nondestructive imaging method is precise in estimating P requirements of different crop species.Keywords: Image-based techniques, leaf area, leaf P contents, linear discriminant analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16498314 New Enhanced Hexagon-Based Search Using Point-Oriented Inner Search for Fast Block Motion Estimation
Authors: Lai-Man Po, Chi-Wang Ting, Ka-Ho Ng
Abstract:
Recently, an enhanced hexagon-based search (EHS) algorithm was proposed to speedup the original hexagon-based search (HS) by exploiting the group-distortion information of some evaluated points. In this paper, a second version of the EHS is proposed with a new point-oriented inner search technique which can further speedup the HS in both large and small motion environments. Experimental results show that the enhanced hexagon-based search version-2 (EHS2) is faster than the HS up to 34% with negligible PSNR degradation.Keywords: Inner search, fast motion estimation, block-matching, hexagon search
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14298313 Choosing Search Algorithms in Bayesian Optimization Algorithm
Authors: Hao Wu, Jonathan L. Shapiro
Abstract:
The Bayesian Optimization Algorithm (BOA) is an algorithm based on the estimation of distributions. It uses techniques from modeling data by Bayesian networks to estimating the joint distribution of promising solutions. To obtain the structure of Bayesian network, different search algorithms can be used. The key point that BOA addresses is whether the constructed Bayesian network could generate new and useful solutions (strings), which could lead the algorithm in the right direction to solve the problem. Undoubtedly, this ability is a crucial factor of the efficiency of BOA. Varied search algorithms can be used in BOA, but their performances are different. For choosing better ones, certain suitable method to present their ability difference is needed. In this paper, a greedy search algorithm and a stochastic search algorithm are used in BOA to solve certain optimization problem. A method using Kullback-Leibler (KL) Divergence to reflect their difference is described.
Keywords: Bayesian optimization algorithm, greedy search, KL divergence, stochastic search.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16978312 An Improved Fast Search Method Using Histogram Features for DNA Sequence Database
Authors: Qiu Chen, Feifei Lee, Koji Kotani, Tadahiro Ohmi
Abstract:
In this paper, we propose an efficient hierarchical DNA sequence search method to improve the search speed while the accuracy is being kept constant. For a given query DNA sequence, firstly, a fast local search method using histogram features is used as a filtering mechanism before scanning the sequences in the database. An overlapping processing is newly added to improve the robustness of the algorithm. A large number of DNA sequences with low similarity will be excluded for latter searching. The Smith-Waterman algorithm is then applied to each remainder sequences. Experimental results using GenBank sequence data show the proposed method combining histogram information and Smith-Waterman algorithm is more efficient for DNA sequence search.Keywords: Fast search, DNA sequence, Histogram feature, Smith-Waterman algorithm, Local search
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13288311 Web Search Engine Based Naming Procedure for Independent Topic
Authors: Takahiro Nishigaki, Takashi Onoda
Abstract:
In recent years, the number of document data has been increasing since the spread of the Internet. Many methods have been studied for extracting topics from large document data. We proposed Independent Topic Analysis (ITA) to extract topics independent of each other from large document data such as newspaper data. ITA is a method for extracting the independent topics from the document data by using the Independent Component Analysis. The topic represented by ITA is represented by a set of words. However, the set of words is quite different from the topics the user imagines. For example, the top five words with high independence of a topic are as follows. Topic1 = {"scor", "game", "lead", "quarter", "rebound"}. This Topic 1 is considered to represent the topic of "SPORTS". This topic name "SPORTS" has to be attached by the user. ITA cannot name topics. Therefore, in this research, we propose a method to obtain topics easy for people to understand by using the web search engine, topics given by the set of words given by independent topic analysis. In particular, we search a set of topical words, and the title of the homepage of the search result is taken as the topic name. And we also use the proposed method for some data and verify its effectiveness.Keywords: Independent topic analysis, topic extraction, topic naming, web search engine.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5008310 Personalization of Web Search Using Web Page Clustering Technique
Authors: Amol Bapuso Rajmane, Pradeep M. Patil, Prakash J. Kulkarni
Abstract:
The Information Retrieval community is facing the problem of effective representation of Web search results. When we organize web search results into clusters it becomes easy to the users to quickly browse through search results. The traditional search engines organize search results into clusters for ambiguous queries, representing each cluster for each meaning of the query. The clusters are obtained according to the topical similarity of the retrieved search results, but it is possible for results to be totally dissimilar and still correspond to the same meaning of the query. People search is also one of the most common tasks on the Web nowadays, but when a particular person’s name is queried the search engines return web pages which are related to different persons who have the same queried name. By placing the burden on the user of disambiguating and collecting pages relevant to a particular person, in this paper, we have developed an approach that clusters web pages based on the association of the web pages to the different people and clusters that are based on generic entity search.
Keywords: Entity resolution, information retrieval, graph based disambiguation, web people search, clustering.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15028309 Information Retrieval in Domain Specific Search Engine with Machine Learning Approaches
Authors: Shilpy Sharma
Abstract:
As the web continues to grow exponentially, the idea of crawling the entire web on a regular basis becomes less and less feasible, so the need to include information on specific domain, domain-specific search engines was proposed. As more information becomes available on the World Wide Web, it becomes more difficult to provide effective search tools for information access. Today, people access web information through two main kinds of search interfaces: Browsers (clicking and following hyperlinks) and Query Engines (queries in the form of a set of keywords showing the topic of interest) [2]. Better support is needed for expressing one's information need and returning high quality search results by web search tools. There appears to be a need for systems that do reasoning under uncertainty and are flexible enough to recover from the contradictions, inconsistencies, and irregularities that such reasoning involves. In a multi-view problem, the features of the domain can be partitioned into disjoint subsets (views) that are sufficient to learn the target concept. Semi-supervised, multi-view algorithms, which reduce the amount of labeled data required for learning, rely on the assumptions that the views are compatible and uncorrelated. This paper describes the use of semi-structured machine learning approach with Active learning for the “Domain Specific Search Engines". A domain-specific search engine is “An information access system that allows access to all the information on the web that is relevant to a particular domain. The proposed work shows that with the help of this approach relevant data can be extracted with the minimum queries fired by the user. It requires small number of labeled data and pool of unlabelled data on which the learning algorithm is applied to extract the required data.Keywords: Search engines; machine learning, Informationretrieval, Active logic.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20828308 In Search of Excellence – Google vs Baidu
Authors: Linda, Sau-ling LAI
Abstract:
This paper compares the search engine marketing strategies adopted in China and the Western countries through two illustrative cases, namely, Google and Baidu. Marketers in the West use search engine optimization (SEO) to rank their sites higher for queries in Google. Baidu, however, offers paid search placement, or the selling of engine results for particular keywords to the higher bidders. Whereas Google has been providing innovative services ranging from Google Map to Google Blog, Baidu remains focused on search services – the one that it does best. The challenges and opportunities of the Chinese Internet market offered to global entrepreneurs are also discussed in the paperKeywords: Search Engine, Web analytics, Google, Baidu
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 24558307 Motion Area Estimated Motion Estimation with Triplet Search Patterns for H.264/AVC
Authors: T. Song, T. Shimamoto
Abstract:
In this paper a fast motion estimation method for H.264/AVC named Triplet Search Motion Estimation (TS-ME) is proposed. Similar to some of the traditional fast motion estimation methods and their improved proposals which restrict the search points only to some selected candidates to decrease the computation complexity, proposed algorithm separate the motion search process to several steps but with some new features. First, proposed algorithm try to search the real motion area using proposed triplet patterns instead of some selected search points to avoid dropping into the local minimum. Then, in the localized motion area a novel 3-step motion search algorithm is performed. Proposed search patterns are categorized into three rings on the basis of the distance from the search center. These three rings are adaptively selected by referencing the surrounding motion vectors to early terminate the motion search process. On the other hand, computation reduction for sub pixel motion search is also discussed considering the appearance probability of the sub pixel motion vector. From the simulation results, motion estimation speed improved by a factor of up to 38 when using proposed algorithm than that of the reference software of H.264/AVC with ignorable picture quality loss.Keywords: Motion estimation, VLSI, image processing, search patterns
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13318306 Fast Search for MPEG Video Clips Using Adjacent Pixel Intensity Difference Quantization Histogram Feature
Authors: Feifei Lee, Qiu Chen, Koji Kotani, Tadahiro Ohmi
Abstract:
In this paper, we propose a novel fast search algorithm for short MPEG video clips from video database. This algorithm is based on the adjacent pixel intensity difference quantization (APIDQ) algorithm, which had been reliably applied to human face recognition previously. An APIDQ histogram is utilized as the feature vector of the frame image. Instead of fully decompressed video frames, partially decoded data, namely DC images are utilized. Combined with active search [4], a temporal pruning algorithm, fast and robust video search can be realized. The proposed search algorithm has been evaluated by 6 hours of video to search for given 200 MPEG video clips which each length is 15 seconds. Experimental results show the proposed algorithm can detect the similar video clip in merely 80ms, and Equal Error Rate (ERR) of 3 % is achieved, which is more accurately and robust than conventional fast video search algorithm.
Keywords: Fast search, adjacent pixel intensity difference quantization (APIDQ), DC image, histogram feature.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15788305 Joint Adaptive Block Matching Search (JABMS) Algorithm
Authors: V.K.Ananthashayana, Pushpa.M.K
Abstract:
In this paper a new Joint Adaptive Block Matching Search (JABMS) algorithm is proposed to generate motion vector and search a best match macro block by classifying the motion vector movement based on prediction error. Diamond Search (DS) algorithm generates high estimation accuracy when motion vector is small and Adaptive Rood Pattern Search (ARPS) algorithm can handle large motion vector but is not very accurate. The proposed JABMS algorithm which is capable of considering both small and large motions gives improved estimation accuracy and the computational cost is reduced by 15.2 times compared with Exhaustive Search (ES) algorithm and is 1.3 times less compared with Diamond search algorithm.Keywords: Adaptive rood pattern search, Block matching, Diamond search, Joint Adaptive search, Motion estimation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16918304 A Development of Personalized Edutainment Contents through Storytelling
Authors: Min Kyeong Cha, Ju Yeon Mun, Seong Baeg Kim
Abstract:
Recently, ‘play of learning’ becomes important and is emphasized as a useful learning tool. Therefore, interest in edutainment contents is growing. Storytelling is considered first as a method that improves the transmission of information and learner's interest when planning edutainment contents. In this study, we designed edutainment contents in the form of an adventure game that applies the storytelling method. This content provides questions and items constituted dynamically and reorganized learning contents through analysis of test results. It allows learners to solve various questions through effective iterative learning. As a result, the learners can reach mastery learning.
Keywords: Storytelling, edutainment, mastery learning, computer operating principle.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18458303 Linking OpenCourseWares and Open Education Resources: Creating an Effective Search and Recommendation System
Authors: Brett E. Shelton, Joel Duffin, Yuxuan Wang, Justin Ball
Abstract:
With a growing number of digital libraries and other open education repositories being made available throughout the world, effective search and retrieval tools are necessary to access the desired materials that surpass the effectiveness of traditional, allinclusive search engines. This paper discusses the design and use of Folksemantic, a platform that integrates OpenCourseWare search, Open Educational Resource recommendations, and social network functionality into a single open source project. The paper describes how the system was originally envisioned, its goals for users, and data that provides insight into how it is actually being used. Data sources include website click-through data, query logs, web server log files and user account data. Based on a descriptive analysis of its current use, modifications to the platform's design are recommended to better address goals of the system, along with recommendations for additional phases of research.Keywords: Digital libraries, open education, recommendation system, social networks
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21998302 Fast Codevector Search Algorithm for 3-D Vector Quantized Codebook
Authors: H. B. Kekre, Tanuja K. Sarode
Abstract:
This paper presents a very simple and efficient algorithm for codebook search, which reduces a great deal of computation as compared to the full codebook search. The algorithm is based on sorting and centroid technique for search. The results table shows the effectiveness of the proposed algorithm in terms of computational complexity. In this paper we also introduce a new performance parameter named as Average fractional change in pixel value as we feel that it gives better understanding of the closeness of the image since it is related to the perception. This new performance parameter takes into consideration the average fractional change in each pixel value.Keywords: Vector Quantization, Data Compression, Encoding, Searching.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16118301 Meta-Search in Human Resource Management
Authors: Jürgen Dorn, Tabbasum Naz
Abstract:
In the area of Human Resource Management, the trend is towards online exchange of information about human resources. For example, online applications for employment become standard and job offerings are posted in many job portals. However, there are too many job portals to monitor all of them if someone is interested in a new job. We developed a prototype for integrating information of different job portals into one meta-search engine. First, existing job portals were investigated and XML schema documents were derived automated from these portals. Second, translation rules for transforming each schema to a central HR-XML-conform schema were determined. The HR-XML-schema is used to build a form for searching jobs. The data supplied by a user in this form is now translated into queries for the different job portals. Each result obtained by a job portal is sent to the meta-search engine that ranks the result of all received job offers according to user's preferences.Keywords: Meta-search, Information extraction and integration, human resource management, job search.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16938300 Online Topic Model for Broadcasting Contents Using Semantic Correlation Information
Authors: Chang-Uk Kwak, Sun-Joong Kim, Seong-Bae Park, Sang-Jo Lee
Abstract:
This paper proposes a method of learning topics for broadcasting contents. There are two kinds of texts related to broadcasting contents. One is a broadcasting script, which is a series of texts including directions and dialogues. The other is blogposts, which possesses relatively abstracted contents, stories, and diverse information of broadcasting contents. Although two texts range over similar broadcasting contents, words in blogposts and broadcasting script are different. When unseen words appear, it needs a method to reflect to existing topic. In this paper, we introduce a semantic vocabulary expansion method to reflect unseen words. We expand topics of the broadcasting script by incorporating the words in blogposts. Each word in blogposts is added to the most semantically correlated topics. We use word2vec to get the semantic correlation between words in blogposts and topics of scripts. The vocabularies of topics are updated and then posterior inference is performed to rearrange the topics. In experiments, we verified that the proposed method can discover more salient topics for broadcasting contents.
Keywords: Broadcasting script analysis, topic expansion, semantic correlation analysis, word2vec.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17608299 Language and Retrieval Accuracy
Authors: Ahmed Abdelali, Jim Cowie, Hamdy S. Soliman
Abstract:
One of the major challenges in the Information Retrieval field is handling the massive amount of information available to Internet users. Existing ranking techniques and strategies that govern the retrieval process fall short of expected accuracy. Often relevant documents are buried deep in the list of documents returned by the search engine. In order to improve retrieval accuracy we examine the issue of language effect on the retrieval process. Then, we propose a solution for a more biased, user-centric relevance for retrieved data. The results demonstrate that using indices based on variations of the same language enhances the accuracy of search engines for individual users.Keywords: Information Search and Retrieval, LanguageVariants, Search Engine, Retrieval Accuracy.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14768298 Searchable Encryption in Cloud Storage
Authors: Ren-Junn Hwang, Chung-Chien Lu, Jain-Shing Wu
Abstract:
Cloud outsource storage is one of important services in cloud computing. Cloud users upload data to cloud servers to reduce the cost of managing data and maintaining hardware and software. To ensure data confidentiality, users can encrypt their files before uploading them to a cloud system. However, retrieving the target file from the encrypted files exactly is difficult for cloud server. This study proposes a protocol for performing multikeyword searches for encrypted cloud data by applying k-nearest neighbor technology. The protocol ranks the relevance scores of encrypted files and keywords, and prevents cloud servers from learning search keywords submitted by a cloud user. To reduce the costs of file transfer communication, the cloud server returns encrypted files in order of relevance. Moreover, when a cloud user inputs an incorrect keyword and the number of wrong alphabet does not exceed a given threshold; the user still can retrieve the target files from cloud server. In addition, the proposed scheme satisfies security requirements for outsourced data storage.
Keywords: Fault-tolerance search, multi-keywords search, outsource storage, ranked search, searchable encryption.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 30808297 Non-Overlapping Hierarchical Index Structure for Similarity Search
Authors: Mounira Taileb, Sid Lamrous, Sami Touati
Abstract:
In order to accelerate the similarity search in highdimensional database, we propose a new hierarchical indexing method. It is composed of offline and online phases. Our contribution concerns both phases. In the offline phase, after gathering the whole of the data in clusters and constructing a hierarchical index, the main originality of our contribution consists to develop a method to construct bounding forms of clusters to avoid overlapping. For the online phase, our idea improves considerably performances of similarity search. However, for this second phase, we have also developed an adapted search algorithm. Our method baptized NOHIS (Non-Overlapping Hierarchical Index Structure) use the Principal Direction Divisive Partitioning (PDDP) as algorithm of clustering. The principle of the PDDP is to divide data recursively into two sub-clusters; division is done by using the hyper-plane orthogonal to the principal direction derived from the covariance matrix and passing through the centroid of the cluster to divide. Data of each two sub-clusters obtained are including by a minimum bounding rectangle (MBR). The two MBRs are directed according to the principal direction. Consequently, the nonoverlapping between the two forms is assured. Experiments use databases containing image descriptors. Results show that the proposed method outperforms sequential scan and SRtree in processing k-nearest neighbors.
Keywords: K-nearest neighbour search, multi-dimensional indexing, multimedia databases, similarity search.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1561