Search results for: adhoc retrieval
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 338

158 Detecting Paraphrases in Arabic Text

Authors: Amal Alshahrani, Allan Ramsay

Abstract:

Paraphrasing, i.e. expressing the same concept in alternative ways by using different words or phrases, is one of the important tasks in natural language processing. Paraphrases can be used in many natural language applications, such as Information Retrieval, Machine Translation, Question Answering, Text Summarization, or Information Extraction. To obtain pairs of sentences that are paraphrases, we create a system that automatically extracts paraphrases from a corpus built from different sources of news articles, since these are likely to contain paraphrases when they report the same event on the same day. Simple standard approaches (e.g. TF-IDF vector space, cosine similarity) and alignment techniques (e.g. Dynamic Time Warping (DTW)) for extracting paraphrases have been applied to English. However, the performance of these approaches could be affected when they are applied to another language, for instance Arabic, due to phenomena that are not present in English, such as free word order, zero copula, and pro-dropping. Thus, if we can analyse how the existing algorithms for English fail for Arabic, then we can find a solution for Arabic. The results are promising.

Keywords: natural language processing, TF-IDF, cosine similarity, dynamic time warping (DTW)

Procedia PDF Downloads 356
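The two baseline techniques named in the abstract above (TF-IDF with cosine similarity, and DTW alignment) can be sketched roughly as follows. This is a minimal illustration, not the authors' system: the smoothed IDF formula, the 0/1 word-match cost used for DTW, and the function names are all assumptions made for the example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenised document.

    Assumption: relative term frequency times a smoothed IDF,
    log((1 + N) / (1 + df)) + 1. Other weightings are equally valid.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def dtw_distance(a, b):
    """Dynamic Time Warping distance between two token sequences.

    Assumption: local cost is 0 for identical tokens and 1 otherwise.
    """
    inf = float("inf")
    d = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    d[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0.0 if a[i - 1] == b[j - 1] else 1.0
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[len(a)][len(b)]
```

Both measures rely on surface word identity and, for DTW, on word order, which is one plausible reason they could degrade on a language with free word order.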
157 Students’ Willingness to Use Public Computing Facilities at a Library

Authors: Norbayah Mohd Suki, Norazah Mohd Suki

Abstract:

This study aims to examine relationships between attitude, self-efficacy, and subjective norm with students’ behavioural intention to use public computing facilities at a library. Data was collected from 200 undergraduate students enrolled at a higher learning institution in the Federal Territory of Labuan, Malaysia via a structured questionnaire comprising closed-ended questions. Data was analyzed using multiple regression analysis. The results show that students’ behavioural intention to use public computing facilities at the library is widely affected by subjective norm factor i.e. influence of the support of family members, friends and neighbours. The findings of this study provide a better understanding of factors likely to influence students’ behavioural intention to use public computing facilities at a library. It also offers valuable insights into factors which university librarians need to focus on to improve students’ behavioural intention to actively use public computing facilities at a library for quality information retrieval. Direction for future research is also presented.

Keywords: attitude, self-efficacy, subjective norm, behavioural intention

Procedia PDF Downloads 419
156 Personalize E-Learning System Based on Clustering and Sequence Pattern Mining Approach

Authors: H. S. Saini, K. Vijayalakshmi, Rishi Sayal

Abstract:

Network-based education has been growing rapidly in size and quality. Knowledge clustering becomes more important in personalized information retrieval for web-learning. A personalized learning service can be provided after the learners’ knowledge has been classified with clustering. Through automatic analysis of learners’ behaviors, groups of learners with similar knowledge levels and interests may be discovered, so as to provide learners with content that best matches their educational needs for collaborative learning. We present a specific mining tool and a recommender engine that we have integrated into the online learning system in order to help the teacher carry out the whole e-learning process. We propose to use sequential pattern mining algorithms to discover the paths most used by students and, from this information, to recommend links to new students automatically while they browse the course. We have developed a specific authoring tool in order to help the teacher apply the whole data mining process. We report on several experiments with real data that indicate the quality of using clustering and sequential pattern mining algorithms together for building personalized e-learning systems.

Keywords: e-learning, cluster, personalization, sequence, pattern

Procedia PDF Downloads 403
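The idea of recommending links from the most used paths, as described in the abstract above, can be illustrated with a deliberately simplified sketch. It counts contiguous page-to-page paths per session rather than running a full sequential pattern mining algorithm, and the function names and the session format (a list of page identifiers per student) are assumptions made for the example.

```python
from collections import Counter


def count_paths(sessions, length=2):
    """Count how many sessions contain each contiguous path of `length` pages."""
    counts = Counter()
    for session in sessions:
        # A set ensures each path is counted once per session (its "support").
        paths = {
            tuple(session[i:i + length])
            for i in range(len(session) - length + 1)
        }
        counts.update(paths)
    return counts


def recommend_next(sessions, current_page):
    """Suggest the page that most often directly follows `current_page`."""
    counts = count_paths(sessions, length=2)
    candidates = {
        path[1]: support
        for path, support in counts.items()
        if path[0] == current_page
    }
    if not candidates:
        return None
    # Sorting first makes ties resolve alphabetically, so results are stable.
    return max(sorted(candidates), key=candidates.get)
```

A real system would typically mine non-contiguous patterns with a minimum support threshold and would restrict the sessions to those of learners in the same cluster.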
155 SC-LSH: An Efficient Indexing Method for Approximate Similarity Search in High Dimensional Space

Authors: Sanaa Chafik, Imane Daoudi, Mounim A. El Yacoubi, Hamid El Ouardi

Abstract:

Locality Sensitive Hashing (LSH) is one of the most promising techniques for solving the nearest neighbour search problem in high dimensional space. Euclidean LSH is the most popular variation of LSH and has been successfully applied in many multimedia applications. However, Euclidean LSH has limitations that affect its structure and query performance. Its main limitation is large memory consumption: to achieve good accuracy, a large number of hash tables is required. In this paper, we propose a new hashing algorithm that overcomes the storage space problem and improves query time while keeping an accuracy similar to that achieved by the original Euclidean LSH. Experimental results on a real large-scale dataset show that the proposed approach achieves good performance and consumes less memory than Euclidean LSH.

Keywords: approximate nearest neighbor search, content based image retrieval (CBIR), curse of dimensionality, locality sensitive hashing, multidimensional indexing, scalability

Procedia PDF Downloads 303
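For readers unfamiliar with the baseline discussed above, a single Euclidean LSH table can be sketched as follows. This shows the standard scheme h(v) = floor((a·v + b) / w) with Gaussian projections, not the SC-LSH method proposed in the paper; the class name and parameters are assumptions made for the example.

```python
import math
import random
from collections import defaultdict


class EuclideanLSHTable:
    """One hash table of standard Euclidean LSH (illustrative sketch only)."""

    def __init__(self, dim, num_hashes, bucket_width, seed=0):
        rng = random.Random(seed)
        self.bucket_width = bucket_width
        # Each hash function has a Gaussian direction a and an offset b in [0, w).
        self.projections = [
            [rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_hashes)
        ]
        self.offsets = [rng.uniform(0.0, bucket_width) for _ in range(num_hashes)]
        self.buckets = defaultdict(list)

    def key(self, vector):
        """Concatenate h(v) = floor((a . v + b) / w) over all hash functions."""
        return tuple(
            math.floor(
                (sum(a_i * x_i for a_i, x_i in zip(a, vector)) + b)
                / self.bucket_width
            )
            for a, b in zip(self.projections, self.offsets)
        )

    def insert(self, label, vector):
        self.buckets[self.key(vector)].append(label)

    def query(self, vector):
        """Return the labels stored in the bucket that `vector` falls into."""
        return list(self.buckets.get(self.key(vector), []))
```

Because one table only finds neighbours that land in the same bucket, practical deployments keep many such tables with independent projections, which is the memory cost the abstract refers to.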
154 Programming Language Extension Using Structured Query Language for Database Access

Authors: Chapman Eze Nnadozie

Abstract:

Relational databases constitute a vital tool for the effective management and administration of both personal and organizational data. Data access ranges from single-user database management software to more complex distributed server systems. This paper appraises the use of a programming language extension such as structured query language (SQL) to establish links to a relational database (Microsoft Access 2013) using the Visual C++ 9 programming language environment. The methodology involves creating tables to form a database using Microsoft Access 2013, which is Object Linking and Embedding (OLE) database compliant. SQL commands are used to query the tables in the database for easy extraction of expected records inside the Visual C++ environment. The findings of this paper reveal that records can easily be accessed and manipulated to filter exactly what the user wants, such as retrieval of records with specified criteria, updating of records, and deletion of part or all of the records in a table.

Keywords: data access, database, database management system, OLE, programming language, records, relational database, software, SQL, table

Procedia PDF Downloads 161
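The three operations listed in the abstract above (retrieval by criteria, update, and deletion) can be illustrated with embedded SQL. The sketch below substitutes Python's built-in sqlite3 module and an in-memory SQLite database for the paper's Visual C++ and Microsoft Access setup; the table name, columns and rows are hypothetical.

```python
import sqlite3


def run_demo():
    """Create a small table, then retrieve, update and delete records with SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, score INTEGER)"
    )
    conn.executemany(
        "INSERT INTO students (name, score) VALUES (?, ?)",
        [("Ada", 82), ("Ben", 47), ("Chi", 65)],
    )

    # Retrieval of records matching a specified criterion.
    passed = conn.execute(
        "SELECT name FROM students WHERE score >= ? ORDER BY name", (50,)
    ).fetchall()

    # Updating a record, then deleting another.
    conn.execute("UPDATE students SET score = ? WHERE name = ?", (55, "Ben"))
    conn.execute("DELETE FROM students WHERE name = ?", ("Chi",))

    remaining = conn.execute(
        "SELECT name, score FROM students ORDER BY name"
    ).fetchall()
    conn.close()
    return passed, remaining
```

Passing values through `?` placeholders rather than string concatenation keeps the host-language code and the SQL text separate and avoids injection problems.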
153 Key Frame Based Video Summarization via Dependency Optimization

Authors: Janya Sainui

Abstract:

With the rapid growth of digital videos and data communications, video summarization that provides a shorter version of a video for fast browsing and retrieval is necessary. Key frame extraction is one of the mechanisms for generating a video summary. In general, the extracted key frames should both represent the entire video content and contain minimum redundancy. However, most existing approaches select key frames heuristically; hence, the selected key frames may not be the most distinct frames and/or may not cover the entire content of the video. In this paper, we propose a video summarization method that provides reasonable objective functions for selecting key frames. In particular, we apply a statistical dependency measure called quadratic mutual information as our objective function for maximizing the coverage of the entire video content as well as minimizing the redundancy among selected key frames. The proposed key frame extraction algorithm finds key frames by solving an optimization problem. Through experiments, we demonstrate that the proposed video summarization approach produces video summaries with better coverage of the entire video content and less redundancy among key frames compared to state-of-the-art approaches.

Keywords: video summarization, key frame extraction, dependency measure, quadratic mutual information

Procedia PDF Downloads 248
152 miCoRe: Colorectal Cancer miRNAs Database

Authors: Rahul Agarwal, Ashutosh Singh

Abstract:

Colorectal cancer (CRC) is also referred to as bowel cancer or colon cancer. It involves the abnormal growth of cells in the colon or rectum. This work leads to the development of a miRNA database for colorectal cancer, which we named miCoRe. The database comprises information on all validated colorectal cancer miRNAs, collected from various published literature reports, together with an effective knowledge-based information retrieval system. MySQL is used for the main framework of miCoRe, while the front end was developed in PHP. The aim of developing miCoRe is to create a comprehensive central repository of colorectal carcinoma miRNAs with all germane information about the miRNAs and their target genes. The current version of miCoRe consists of 238 miRNAs that are known to be implicated in the malignancy of CRC. Along with miRNA information, miCoRe also contains information related to the target genes of these miRNAs. miCoRe furnishes information about the mechanism of incidence and progression of the disease, which would further help researchers look for colorectal-specific miRNA therapies and CRC-specific targeted drug design. Moreover, it will also help in the development of biomarkers for better and earlier detection of CRC and in better clinical management of the disease.

Keywords: colorectal cancer, database, miCoRe, miRNAs

Procedia PDF Downloads 251
151 A Correlation Between Perceived Usage of Project Management Methodologies and Project Success in Horizon 2020 Projects

Authors: Aurelio Palacardo, Giulio Mangano, Alberto De Marco

Abstract:

Nowadays, the global economic framework is extremely competitive, and it consequently requires efficient deployment of the resources provided by the EU. In this context, project management practices are intended to be one of the levers for increasing such efficiency. The objective of this work is to explore the usage of Project Management (PM) methodologies and good practices in the Europe-wide research program “Horizon2020” and establish whether their maturity might impact the project's success. This makes it possible to identify strengths in the application of PM methodologies and good practices and, in turn, to provide feedback and opportunities for improvement to be implemented in future programs. To achieve this objective, the present research uses survey-based data retrieval and correlation analysis to investigate the level of perceived PM maturity in H2020 projects and the correlation of maturity with project success. The results show that the Project Managers involved in H2020 hold a high level of PM maturity, confirming that PM standards, which are imposed by the EU Commission as a binding process, are effectively enforced.

Keywords: project management, project management maturity, maturity models, project success

Procedia PDF Downloads 134
150 Impact of Similarity Ratings on Human Judgement

Authors: Ian A. McCulloh, Madelaine Zinser, Jesse Patsolic, Michael Ramos

Abstract:

Recommender systems are a common artificial intelligence (AI) application. For any given input, a search system will return a rank-ordered list of similar items. As users review returned items, they must decide when to halt the search and either revise search terms or conclude their requirement is novel with no similar items in the database. We present a statistically designed experiment that investigates the impact of similarity ratings on human judgement to conclude a search item is novel and halt the search. 450 participants were recruited from Amazon Mechanical Turk to render judgement across 12 decision tasks. We find the inclusion of ratings increases the human perception that items are novel. Percent similarity increases novelty discernment when compared with star-rated similarity or the absence of a rating. Ratings reduce the time to decide and improve decision confidence. This suggests the inclusion of similarity ratings can aid human decision-makers in knowledge search tasks.

Keywords: ratings, rankings, crowdsourcing, empirical studies, user studies, similarity measures, human-centered computing, novelty in information retrieval

Procedia PDF Downloads 92
149 Coronavirus Academic Paper Sorting Application

Authors: Christina A. van Hal, Xiaoqian Jiang, Luyao Chen, Yan Chu, Robert D. Jolly, Yaobin Lin, Jitian Zhao, Kang Lin Hsieh

Abstract:

The COVID-19 Literature Summary App was created for the primary purpose of enabling academicians and clinicians to quickly sort through the vast array of recent coronavirus publications by topics of interest. Multiple methods of summarizing and sorting the manuscripts were created. A summary page introduces the application's function and capabilities, while an interactive map provides daily updates on infection, death, and recovery rates. A page with a pivot table allows publication sorting by topic, with an interactive data table that allows sorting topics by columns, as well as the capability to view abstracts. Additionally, publications may be sorted by the medical topics they cover. We used the CORD-19 database to compile lists of publications. The data table can sort binary variables, allowing the user to pick desired publication topics, such as papers that describe COVID-19 symptoms. The application is primarily designed for use by researchers but can be used by anybody who wants a faster and more efficient means of locating papers of interest.

Keywords: COVID-19, literature summary, information retrieval, Snorkel

Procedia PDF Downloads 128
148 Effects of Aging on Auditory and Visual Recall Abilities

Authors: Rashmi D. G., Aishwarya G., Niharika M. K.

Abstract:

Purpose: Free recall tasks target cognitive and linguistic processes like episodic memory, lexical access and retrieval. Consequently, the free recall paradigm is suitable for assessing memory deterioration caused by aging; this also depends on linguistic factors, including the use of first and second languages and their relative ability. Hence, the present study aimed to determine if aging has an effect on visual and auditory recall abilities. Method: Twenty young adults (mean age: 25.4±0.99) and older adults (mean age: 63.3±3.51) participated in the study. Participants performed a free recall task under two conditions – related and unrelated and two modalities - visual and auditory where they were instructed to recall as many items as possible with no specific order and time limit. Results: Free recall performance was calculated as the mean number of correctly recalled items. Although younger participants recalled a higher number of items, the performance across conditions and modality was variable. Conclusion: In summary, the findings of the present study revealed an age-related decline in the efficiency of episodic memory, which is crucial to remember recent events.

Keywords: recall, episodic memory, aging, modality

Procedia PDF Downloads 68
147 Quantitative Phase Imaging System Based on a Three-Lens Common-Path Interferometer

Authors: Alexander Machikhin, Olga Polschikova, Vitold Pozhar, Alina Ramazanova

Abstract:

White-light quantitative phase imaging is an effective technique for achieving sub-nanometer phase sensitivity. Highly stable interferometers based on common-path geometry have been developed in recent years to solve this task. Some of these methods also apply a multispectral approach. The purpose of this research is to suggest a simple and effective interferometer for such systems. We developed a three-lens common-path interferometer, which can be used for quantitative phase imaging with or without multispectral modality. The lens system consists of two components, the first of which is a compound lens consisting of two lenses. A pinhole is placed between the components. The lens-in-lens approach enables effective light transmission and high stability of the interferometer. Multispectral operation is easily implemented by placing a tunable filter in front of the interferometer. In our work, we used an acousto-optical tunable filter. Some design considerations are discussed and multispectral quantitative phase retrieval is demonstrated.

Keywords: acousto-optical tunable filter, common-path interferometry, digital holography, multispectral quantitative phase imaging

Procedia PDF Downloads 283
146 Optimized Text Summarization Model on Mobile Screens for Sight-Interpreters: An Empirical Study

Authors: Jianhua Wang

Abstract:

To obtain key information quickly from long texts on the small screens of mobile devices, sight-interpreters need to establish an optimized summarization model for fast information retrieval. Four summarization models based on previous studies were examined: title+key words (TKW), title+topic sentences (TTS), key words+topic sentences (KWTS) and title+key words+topic sentences (TKWTS). Psychological experiments were conducted on the four models for three different genres of interpreting texts to establish the optimized summarization model for sight-interpreters. This empirical study shows that the optimized summarization model for sight-interpreters to quickly grasp the key information of the texts they interpret is title+key words (TKW) for cultural texts, title+key words+topic sentences (TKWTS) for economic texts and topic sentences+key words (TSKW) for political texts.

Keywords: different genres, mobile screens, optimized summarization models, sight-interpreters

Procedia PDF Downloads 287
145 Progressive Multimedia Collection Structuring via Scene Linking

Authors: Aman Berhe, Camille Guinaudeau, Claude Barras

Abstract:

In order to facilitate information seeking in large collections of multimedia documents with long and progressive content (such as broadcast news or TV series), one can extract the semantic links that exist between semantically coherent parts of documents, i.e., scenes. The links can then create a coherent collection of scenes from which it is easier to perform content analysis, topic extraction, or information retrieval. In this paper, we focus on TV series structuring and propose two approaches for scene linking at different levels of granularity (episode and season): a fuzzy online clustering technique and a graph-based community detection algorithm. When evaluated on the first two seasons of the TV series Game of Thrones, the fuzzy online clustering approach performed better than graph-based community detection at the episode level, while graph-based approaches showed better performance at the season level.

Keywords: multimedia collection structuring, progressive content, scene linking, fuzzy clustering, community detection

Procedia PDF Downloads 74
144 Bag of Local Features for Person Re-Identification on Large-Scale Datasets

Authors: Yixiu Liu, Yunzhou Zhang, Jianning Chi, Hao Chu, Rui Zheng, Libo Sun, Guanghao Chen, Fangtong Zhou

Abstract:

In the last few years, large-scale person re-identification has attracted a lot of attention in video surveillance since it has potential applications in public safety management. However, it is still a challenging task considering the variation in human pose, the changing illumination conditions and the lack of paired samples. Although accuracy has been significantly improved, the dependence on training data remains serious. To tackle this problem, a new strategy for designing the feature representation is proposed based on the bag of visual words (BoVW) model, which has been widely used in the field of image retrieval. The local features are extracted, and a more discriminative feature representation is obtained by cross-view dictionary learning (CDL); then the assignment map is obtained through k-means clustering. Finally, BoVW histograms are formed, which encode the images with the statistics of the feature classes in the assignment map. Experiments conducted on the CUHK03, Market1501 and MARS datasets show that the proposed method performs favorably against existing approaches.

Keywords: bag of visual words, cross-view dictionary learning, person re-identification, reranking

Procedia PDF Downloads 166
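The final encoding step described in the abstract above, turning an assignment of local features to visual words into a histogram, can be sketched as follows. The sketch assumes a codebook of cluster centres has already been learned (for example by k-means) and omits feature extraction and cross-view dictionary learning entirely; the function names and the tuple representation of descriptors are assumptions made for the example.

```python
import math


def nearest_word(descriptor, codebook):
    """Index of the codebook centre closest to `descriptor` (Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda k: math.dist(descriptor, codebook[k]),
    )


def bovw_histogram(descriptors, codebook):
    """Normalised bag-of-visual-words histogram for one image.

    `descriptors` are the image's local feature vectors; `codebook` holds
    one centre per visual word.
    """
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, codebook)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)
    return [count / total for count in counts]
```

Two images can then be compared by any histogram distance, which is what makes the representation convenient for retrieval and re-identification.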
143 Digital Preservation in Nigeria Universities Libraries: A Comparison between University of Nigeria Nsukka and Ahmadu Bello University Zaria

Authors: Suleiman Musa, Shuaibu Sidi Safiyanu

Abstract:

This study examined digital preservation in Nigerian university libraries through a comparison between the University of Nigeria Nsukka (UNN) and Ahmadu Bello University Zaria (ABU, Zaria). The study utilized primary data obtained from librarians at the two selected institutions. The findings revealed varying results in terms of the skills acquired by librarians before and after digitization at the two institutions. The study reports that journal publications, textbooks, CD-ROMs, conference papers and proceedings, theses, dissertations and seminar papers are among the information resources available for digitization. The study further documents that copyright issues, power failure, and unavailability of needed materials are among the challenges facing the digitization of the institutions' libraries. On the basis of the findings, the study concluded that library digitization enhances efficiency in the organization and retrieval of information. The study therefore recommended upgrading software with backup, training librarians in digital processes, installing antivirus software, and enhancing technical collaboration between the library and MIS.

Keywords: digitalization, preservation, libraries, comparison

Procedia PDF Downloads 307
142 Morphological Processing of Punjabi Text for Sentiment Analysis of Farmer Suicides

Authors: Jaspreet Singh, Gurvinder Singh, Prabhsimran Singh, Rajinder Singh, Prithvipal Singh, Karanjeet Singh Kahlon, Ravinder Singh Sawhney

Abstract:

Morphological evaluation of Indian languages is one of the burgeoning fields in the area of Natural Language Processing (NLP). The evaluation of a language is an eminent task in the era of information retrieval and text mining. The extraction and classification of knowledge from text can be exploited for sentiment analysis and morphological evaluation. This study combines morphological evaluation and sentiment analysis for the task of classifying farmer suicide cases reported in the Punjab state of India. The pre-processing of Punjabi text involves morphological evaluation and normalization of Punjabi word tokens, followed by training of the proposed model using deep learning classification on Punjabi language text extracted from online Punjabi news reports. The class-wise accuracies of sentiment prediction for four negatively oriented classes of farmer suicide cases are 93.85%, 88.53%, 83.3%, and 95.45% respectively. The overall accuracy of sentiment classification obtained using the proposed framework on 275 Punjabi text documents is found to be 90.29%.

Keywords: deep neural network, farmer suicides, morphological processing, punjabi text, sentiment analysis

Procedia PDF Downloads 290
141 Logistic Model Tree and Expectation-Maximization for Pollen Recognition and Grouping

Authors: Endrick Barnacin, Jean-Luc Henry, Jack Molinié, Jimmy Nagau, Hélène Delatte, Gérard Lebreton

Abstract:

Palynology is a field of interest for many disciplines. It has multiple applications such as chronological dating, climatology, allergy treatment, and even honey characterization. Unfortunately, the analysis of a pollen slide is a complicated and time-consuming task that requires the intervention of experts in the field, who are becoming increasingly rare due to economic and social conditions. The automation of this task is therefore a necessity. Pollen slide analysis is mainly a visual process, as it is carried out with the naked eye. That is why a primary method for automating palynology is the use of digital image processing. This method presents the lowest cost and has relatively good accuracy in pollen retrieval. In this work, we propose a system combining recognition and grouping of pollen. It uses a Logistic Model Tree to classify pollen already known by the proposed system while detecting any unknown species. Then, the unknown pollen species are divided using a cluster-based approach. Success rates for the recognition of known species have been achieved, and automated clustering seems to be a promising approach.

Keywords: pollen recognition, logistic model tree, expectation-maximization, local binary pattern

Procedia PDF Downloads 156
140 Optimizing the Efficiency of Measuring Instruments in Ouagadougou-Burkina Faso

Authors: Moses Emetere, Marvel Akinyemi, S. E. Sanni

Abstract:

At the moment, the AERONET or AMMA database shows a large volume of data loss. With only about 47% of the data set available to scientists, it is evident that an accurate nowcast or forecast cannot be guaranteed. The calibration constants of most radiosondes or weather stations are not compatible with the atmospheric conditions of the West African climate. A dispersion model was developed to incorporate salient mathematical representations such as a Unified number. The Unified number was derived to describe the turbulence of aerosol transport in the frictional layer of the lower atmosphere. A fourteen-year data set from the Multi-angle Imaging SpectroRadiometer (MISR) was tested using the dispersion model. A yearly estimation of the atmospheric constants over Ouagadougou using the model was obtained with about 87.5% accuracy. It further revealed that the average atmospheric constants for Ouagadougou-Niger are a_1 = 0.626 and a_2 = 0.7999, and the tuning constants are n_1 = 0.09835 and n_2 = 0.266. Also, the yearly atmospheric constants affirmed that the lower atmosphere of Ouagadougou is very dynamic. Hence, it is recommended that radiosonde and weather station manufacturers constantly review the atmospheric constants over a geographical location to enable about eighty percent data retrieval.

Keywords: aerosols retention, aerosols loading, statistics, analytical technique

Procedia PDF Downloads 280
139 Human Action Recognition Using Wavelets of Derived Beta Distributions

Authors: Neziha Jaouedi, Noureddine Boujnah, Mohamed Salim Bouhlel

Abstract:

Within the framework of enhancing human-machine interaction systems, this paper focuses on human behavior analysis and action recognition. Human behavior is characterized by a duality of actions and reactions (movements, psychological modification, verbal and emotional expression). It is worth noting that much information is hidden behind gestures and the trajectories and speeds of sudden motion points, and many research works have treated this as an information retrieval issue. In our work, we focus on motion extraction, tracking and action recognition using wavelet network approaches. Our contribution uses an analysis of human subtraction by Gaussian Mixture Model (GMM) and of body movement through trajectory models of motion constructed with a Kalman filter. These models make it possible to remove noise by extracting the main motion features and constitute a stable base for identifying the evolution of human activity. Each modality is used to recognize a human action using the wavelets of derived beta distributions approach. The proposed approach has been validated successfully on a subset of the KTH and UCF sports databases.

Keywords: feature extraction, human action classifier, wavelet neural network, beta wavelet

Procedia PDF Downloads 386
138 Resource Creation Using Natural Language Processing Techniques for Malay Translated Qur'an

Authors: Nor Diana Ahmad, Eric Atwell, Brandon Bennett

Abstract:

Text processing techniques for English have been developed for several decades, but text processing methods for the Malay language are still far behind. Moreover, there are limited resources and tools for computational linguistic analysis available for the Malay language. Therefore, this research presents the use of natural language processing (NLP) in processing Malay translated Qur’an text. As a result, a new language resource for the Malay translated Qur’an was created. This resource will help other researchers to build the necessary processing tools for the Malay language. This research also develops a simple question-answer prototype to demonstrate the use of the Malay Qur’an resource for text processing. This prototype has been developed using Python. The prototype pre-processes the Malay Qur’an and an input query using a stemming algorithm and then searches for occurrences of the query word stem. The results show improved matching likelihood between a user query and its answer. A POS-tagging algorithm has also been produced. The stemming and tagging algorithms can be used as tools for research related to other Malay texts and can be used to support applications such as information retrieval, question answering systems, ontology-based search and other text analysis tasks.

Keywords: language resource, Malay translated Qur'an, natural language processing (NLP), text processing

Procedia PDF Downloads 288
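The stem-then-search behaviour of the prototype described above can be sketched in a few lines. The affix lists, the minimum stem length and the sample sentences below are hypothetical and far cruder than a real Malay stemmer; they only illustrate why matching on stems finds more occurrences than matching on surface forms, and they are not the authors' algorithm.

```python
# Toy affix lists, chosen only for illustration (assumption, not the paper's rules).
PREFIXES = ("mem", "ber", "me", "di")
SUFFIXES = ("kan", "nya", "an")
MIN_STEM_LENGTH = 4  # never strip an affix if fewer letters than this would remain


def stem(word):
    """Strip at most one prefix and one suffix from a lower-cased word."""
    word = word.lower()
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= MIN_STEM_LENGTH:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LENGTH:
            word = word[:-len(suffix)]
            break
    return word


def tokenize(text):
    """Split on whitespace and drop surrounding punctuation."""
    tokens = (token.strip(".,;:!?\"'()") for token in text.split())
    return [token for token in tokens if token]


def search(query_word, passages):
    """Return indices of passages containing a word with the same stem as the query."""
    target = stem(query_word)
    return [
        index
        for index, passage in enumerate(passages)
        if any(stem(token) == target for token in tokenize(passage))
    ]
```

With this sketch, a query for a base form also retrieves passages that contain only an affixed form of the same word, which is the improved matching likelihood the abstract reports.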
137 A General Framework for Knowledge Discovery from Echocardiographic and Natural Images

Authors: S. Nandagopalan, N. Pradeep

Abstract:

The aim of this paper is to propose a general framework for storing, analyzing, and extracting knowledge from two-dimensional echocardiographic images, color Doppler images, non-medical images, and general data sets. A number of high performance data mining algorithms have been used to carry out this task. Our framework encompasses four layers, namely physical storage, object identification, knowledge discovery, and user level. Techniques such as an active contour model to identify the cardiac chambers, pixel classification to segment the color Doppler echo image, a universal model for image retrieval, a Bayesian method for classification, and parallel algorithms for image segmentation were employed. Using the efficiently constructed feature vector database, one can perform various data mining tasks such as clustering and classification with efficient algorithms, along with image mining given a query image. All these facilities are included in the framework, which is supported by a state-of-the-art user interface (UI). The algorithms were tested with actual patient data and the Coral image database, and the results show that their performance is better than previously reported results.

Keywords: active contour, Bayesian, echocardiographic image, feature vector

Procedia PDF Downloads 417
136 Knowledge Management and Tourism: An Exploratory Study Applied to Travel Agents in Egypt

Authors: Mohammad Soliman, Mohamed A. Abou-Shouk

Abstract:

Knowledge management focuses on the development, storage, retrieval, and dissemination of information and expertise. It has become an important tool to improve performance in tourism enterprises. This includes improving decision-making, developing customer services, and increasing sales and profits. Knowledge management adoption depends on human, organizational and technological factors. This study aims to explore the concept of knowledge management in travel agents in Egypt. It explores the requirements of adoption and its impact on performance in these agencies. The study targets Category A travel agents in Egypt. The population of the study encompasses Category A travel agents having an online presence. An online questionnaire is used to collect data from managers of travel agents. This study is useful for travel agents who urgently need to restructure their intermediary role and support their survival in the global travel market. The study sheds light on the requirements of adoption and the expected impact on performance. This could help travel agents assess their situation and determine the extent to which they are ready to adopt knowledge management. This study contributes to knowledge by providing insights from the tourism sector in a developing country where the concept of knowledge management is still in its infancy.

Keywords: knowledge management, knowledge management adoption, performance, travel agents

Procedia PDF Downloads 369
135 Efficient Storage and Intelligent Retrieval of Multimedia Streams Using H.265

Authors: S. Sarumathi, C. Deepadharani, Garimella Archana, S. Dakshayani, D. Logeshwaran, D. Jayakumar, Vijayarangan Natarajan

Abstract:

The need of the hour for customers who use a dial-up or low-bandwidth broadband connection for their internet services is access to HD video data. This can be achieved by developing a new video format using H.265, the latest video codec standard, developed by the ISO/IEC Moving Picture Experts Group (MPEG) and the ITU-T Video Coding Experts Group (VCEG) in April 2013. This new standard for video compression has the potential to deliver higher performance than earlier standards such as H.264/AVC. In comparison with H.264, H.265 (HEVC) offers a clearer, higher quality image at half the original bitrate. At this lower bitrate, it is possible to transmit high definition videos using low bandwidth. It doubles the data compression ratio and supports 8K Ultra HD with resolutions up to 8192×4320. In the proposed model, we design a new video format which supports the H.265 standard. The major application areas in the future include performance enhancements in digital television services such as Tata Sky and Sun Direct, Blu-ray Discs, mobile video, video conferencing, and Internet and live video streaming.

Keywords: access HD video, H.265 video standard, high performance, high quality image, low bandwidth, new video format, video streaming applications

Procedia PDF Downloads 334
134 Linguistic Insights Improve Semantic Technology in Medical Research and Patient Self-Management Contexts

Authors: William Michael Short

Abstract:

‘Semantic Web’ technologies such as the Unified Medical Language System Metathesaurus, SNOMED-CT, and MeSH have been touted as transformational for the way users access online medical and health information, enabling both the automated analysis of natural-language data and the integration of heterogeneous health-related resources distributed across the Internet through the use of standardized terminologies that capture concepts and relationships between concepts that are expressed differently across datasets. However, the approaches that have so far characterized ‘semantic bioinformatics’ have not yet fulfilled the promise of the Semantic Web for medical and health information retrieval applications. This paper argues within the perspective of cognitive linguistics and cognitive anthropology that four features of human meaning-making must be taken into account before the potential of semantic technologies can be realized for this domain. First, many semantic technologies operate exclusively at the level of the word. However, texts convey meanings in ways beyond lexical semantics. For example, transitivity patterns (distributions of active or passive voice) and modality patterns (configurations of modal constituents like may, might, could, would, should) convey experiential and epistemic meanings that are not captured by single words. Language users also naturally associate stretches of text with discrete meanings, so that whole sentences can be ascribed senses similar to the senses of words (so-called ‘discourse topics’). Second, natural language processing systems tend to operate according to the principle of ‘one token, one tag’. For instance, occurrences of the word sound must be disambiguated for part of speech: in context, is sound a noun or a verb or an adjective? In syntactic analysis, deterministic annotation methods may be acceptable.
But because natural language utterances are typically characterized by polyvalency and ambiguities of all kinds (including intentional ambiguities), such methods leave the meanings of texts highly impoverished. Third, ontologies tend to be disconnected from everyday language use and so struggle in cases where single concepts are captured through complex lexicalizations that involve profile shifts or other embodied representations. More problematically, concept graphs tend to capture ‘expert’ technical models rather than ‘folk’ models of knowledge and so may not match users’ common-sense intuitions about the organization of concepts in prototypical structures rather than Aristotelian categories. Fourth, and finally, most ontologies do not recognize the pervasively figurative character of human language. However, since the time of Galen the widespread use of metaphor in the linguistic usage of both medical professionals and lay persons has been recognized. In particular, metaphor is a well-documented linguistic tool for communicating experiences of pain. Because semantic medical knowledge-bases are designed to help capture variations within technical vocabularies – rather than the kinds of conventionalized figurative semantics that practitioners as well as patients actually utilize in clinical description and diagnosis – they fail to capture this dimension of linguistic usage. The failure of semantic technologies in these respects degrades the efficiency and efficacy not only of medical research, where information retrieval inefficiencies can lead to direct financial costs to organizations, but also of care provision, especially in contexts of patients’ self-management of complex medical conditions.

Keywords: ambiguity, bioinformatics, language, meaning, metaphor, ontology, semantic web, semantics

Procedia PDF Downloads 102
133 Clustering Categorical Data Using the K-Means Algorithm and the Attribute’s Relative Frequency

Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami

Abstract:

Clustering is a well-known data mining technique used in pattern recognition and information retrieval. The initial dataset to be clustered can contain either categorical or numeric data, and each type of data has its own specific clustering algorithm. In this context, two algorithms have been proposed: k-means for clustering numeric datasets and k-modes for categorical datasets. A main problem encountered in data mining applications is clustering the categorical data that is so prevalent in real datasets. One way to cluster categorical values is to transform the categorical attributes into numeric measures and apply the k-means algorithm directly instead of k-modes. In this paper, we experiment with an approach of this kind, transforming the categorical values into numeric ones using the relative frequency of each modality in the attributes. The proposed approach is compared with a previous method based on transforming the categorical datasets into binary values. The scalability and accuracy of the two methods are evaluated experimentally. The obtained results show that our proposed method outperforms the binary method in all cases.
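
The transformation described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `relative_frequency_encode` and `kmeans`, the toy data, and the choice of a plain Lloyd's k-means with deterministic initialization are all assumptions made for illustration.

```python
from collections import Counter


def relative_frequency_encode(rows):
    """Replace each categorical value by its relative frequency in its column.

    Hypothetical helper: the abstract does not specify the exact encoding.
    """
    n = len(rows)
    columns = list(zip(*rows))
    counts = [Counter(col) for col in columns]
    return [[counts[j][value] / n for j, value in enumerate(row)] for row in rows]


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; centroids start at the first k distinct points."""
    centroids = []
    for p in points:
        if list(p) not in centroids:
            centroids.append(list(p))
        if len(centroids) == k:
            break
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(
                range(len(centroids)),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


# Toy categorical dataset (made up for this example).
rows = [("red", "small"), ("red", "small"), ("red", "large"), ("blue", "large")]
encoded = relative_frequency_encode(rows)
labels = kmeans(encoded, k=2)
```

In this toy data, "small" and "large" occur equally often and therefore receive the same numeric code, so the sketch cannot separate them; whether and how the paper handles such ties is not stated in the abstract.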

Keywords: clustering, unsupervised learning, pattern recognition, categorical datasets, knowledge discovery, k-means

Procedia PDF Downloads 236
132 A General Framework for Knowledge Discovery Using High Performance Machine Learning Algorithms

Authors: S. Nandagopalan, N. Pradeep

Abstract:

The aim of this paper is to propose a general framework for storing, analyzing, and extracting knowledge from two-dimensional echocardiographic images, color Doppler images, non-medical images, and general data sets. A number of high performance data mining algorithms have been used to carry out this task. Our framework encompasses four layers, namely physical storage, object identification, knowledge discovery, and user level. Techniques such as an active contour model to identify the cardiac chambers, pixel classification to segment the color Doppler echo image, a universal model for image retrieval, a Bayesian method for classification, and parallel algorithms for image segmentation were employed. Using the feature vector database that has been efficiently constructed, one can perform various data mining tasks such as clustering and classification with efficient algorithms, along with image mining given a query image. All these facilities are included in the framework, which is supported by a state-of-the-art user interface (UI). The algorithms were tested with actual patient data and the Coral image database, and the results show that their performance is better than previously reported results.

Keywords: active contour, Bayesian, echocardiographic image, feature vector

Procedia PDF Downloads 391
131 Selecting Answers for Questions with Multiple Answer Choices in Arabic Question Answering Based on Textual Entailment Recognition

Authors: Anes Enakoa, Yawei Liang

Abstract:

Question Answering (QA) is one of the most important and demanding tasks in the field of Natural Language Processing (NLP). In QA systems, the answer generation task generates a list of candidate answers to the user's question, of which only one is correct. Answer selection, one of the main components of a QA system, is concerned with selecting the best answer choice from the candidate answers suggested by the system. However, the selection process can be very challenging, especially in Arabic, due to the particularities of the language. To address this challenge, an approach is proposed to answer questions with multiple answer choices for Arabic QA systems based on Textual Entailment (TE) recognition. The developed approach employs a Support Vector Machine that considers lexical, semantic and syntactic features in order to recognize the entailment between the generated hypotheses (H) and the text (T). A set of experiments has been conducted for performance evaluation, and the proposed method reached an overall accuracy of 67.5% with a C@1 score of 80.46%. The obtained results are promising and demonstrate that the proposed method is effective for the TE recognition task.
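
For readers unfamiliar with the C@1 score reported above, the sketch below computes it under the assumption that the paper uses the usual definition, in which unanswered questions are credited at the system's overall accuracy rate: c@1 = (n_R + n_U · n_R / n) / n. The function name `c_at_1` and the counts in the example are hypothetical and are not taken from the paper.

```python
def c_at_1(n_correct, n_unanswered, n_total):
    """Compute c@1 = (n_R + n_U * n_R / n) / n.

    n_correct    -- questions answered correctly (n_R)
    n_unanswered -- questions the system chose to leave unanswered (n_U)
    n_total      -- all questions (n)
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    accuracy = n_correct / n_total
    # Unanswered questions earn partial credit in proportion to accuracy.
    return (n_correct + n_unanswered * accuracy) / n_total


# Hypothetical counts, for illustration only.
score = c_at_1(n_correct=6, n_unanswered=2, n_total=10)
score_all_answered = c_at_1(n_correct=6, n_unanswered=0, n_total=10)
```

When every question is answered, the measure reduces to plain accuracy. That is consistent with a C@1 value above the accuracy figure whenever some questions are left unanswered.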

Keywords: information retrieval, machine learning, natural language processing, question answering, textual entailment

Procedia PDF Downloads 122
130 Development of Fuzzy Logic Control Ontology for E-Learning

Authors: Muhammad Sollehhuddin A. Jalil, Mohd Ibrahim Shapiai, Rubiyah Yusof

Abstract:

Nowadays, ontologies are common in many areas, such as artificial intelligence, bioinformatics, e-commerce, and education. Ontology is one of the focus areas in the field of Information Retrieval. The purpose of an ontology is to provide a conceptual representation of concepts and their relationships within a particular domain. In other words, an ontology provides a common vocabulary for anyone who needs to share information in the domain. Ontologies exist for domains in various fields, covering both engineering and non-engineering knowledge. However, only a few ontologies are available for engineering knowledge, and fuzzy logic, as engineering knowledge, is not yet available as an ontology domain. In general, fuzzy logic requires step-by-step guidelines and instructions for lab experiments. In this study, we present a domain ontology for Fuzzy Logic Control (FLC) knowledge. We use a Table of Content (ToC) with the middle strategy based on the Uschold and King method to develop the FLC ontology. The proposed framework is developed using Protégé as the ontology tool. The Pellet reasoner, available within Protégé, is then used to validate the presented framework. The presented framework offers better performance based on consistency and the classification parameter index. In general, this ontology can provide a platform for anyone who needs to understand FLC knowledge.

Keywords: engineering knowledge, fuzzy logic control ontology, ontology development, table of content

Procedia PDF Downloads 275
129 Using Closed Frequent Itemsets for Hierarchical Document Clustering

Authors: Cheng-Jhe Lee, Chiun-Chieh Hsu

Abstract:

Due to the rapid development of the Internet and the increased availability of digital documents, the excessive information on the Internet has led to an information overflow problem. To address this problem and support effective information retrieval, document clustering has become a popular research topic in text mining. Clustering is the unsupervised classification of data items into groups without the need for training data. Many conventional document clustering methods perform inefficiently on large document collections because they were originally designed for relational databases. They are therefore impractical for real-world document clustering, which requires special handling for high dimensionality and high volume. We propose the FIHC (Frequent Itemset-based Hierarchical Clustering) method, a hierarchical clustering method developed for document clustering, whose intuition is that each cluster shares some common words. FIHC uses such words to cluster documents and builds a hierarchical topic tree. In this paper, we combine the FIHC algorithm with an ontology to solve the semantic problem and mine the meaning behind the words in documents. Furthermore, we use closed frequent itemsets instead of all frequent itemsets, which increases efficiency and scalability. The experimental results show that our method is more accurate than well-known document clustering algorithms.
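
The difference between frequent and closed frequent itemsets can be shown with a brute-force sketch. A frequent itemset is closed when no proper superset has the same support. The code below is not the FIHC algorithm or the authors' mining procedure. The function names, the toy documents, and the exhaustive enumeration (practical only for tiny vocabularies) are assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset in at least min_support documents."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            support = sum(1 for t in transactions if itemset <= t)
            if support >= min_support:
                result[itemset] = support
                found = True
        if not found:
            # No frequent itemset of this size means none of any larger size.
            break
    return result


def closed_itemsets(frequent):
    """Keep only itemsets with no proper superset of equal support."""
    return {
        itemset: support
        for itemset, support in frequent.items()
        if not any(
            itemset < other and support == other_support
            for other, other_support in frequent.items()
        )
    }


# Toy documents represented as sets of words (made up for this example).
docs = [
    {"apple", "fruit", "tree"},
    {"apple", "fruit"},
    {"apple", "phone"},
    {"fruit", "tree"},
]
frequent = frequent_itemsets(docs, min_support=2)
closed = closed_itemsets(frequent)
```

Here {tree} is frequent but not closed, because {fruit, tree} appears in exactly the same documents. Dropping such redundant itemsets shrinks the candidate set without losing support information, which is one plausible source of the efficiency gain the abstract mentions.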

Keywords: FIHC, documents clustering, ontology, closed frequent itemset

Procedia PDF Downloads 370