Search results for: text retrieval
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1553

Search results for: text retrieval

1493 Augmented Reality Technology for a User Interface in an Automated Storage and Retrieval System

Authors: Wen-Jye Shyr, Chun-Yuan Chang, Bo-Lin Wei, Chia-Ming Lin

Abstract:

The task of creating an augmented reality technology was described in this study to give operators a user interface that might be a part of an automated storage and retrieval system. Its objective was to give graduate engineering and technology students a system of tools with which to experiment with the creation of augmented reality technologies. To collect and analyze data for maintenance applications, the students used augmented reality technology. Our findings support the evolution of artificial intelligence towards Industry 4.0 practices and the planned Industry 4.0 research stream. Important first insights into the study's effects on student learning were presented.

Keywords: augmented reality, storage and retrieval system, user interface, programmable logic controller

Procedia PDF Downloads 69
1492 In-Fun-Mation: Putting the Fun in Information Retrieval at the Linnaeus University, Sweden

Authors: Aagesson, Ekstrand, Persson, Sallander

Abstract:

A description of how a team of librarians at Linnaeus University Library in Sweden utilizes a pedagogical approach to deliver engaging digital workshops on information retrieval. The team consists of four librarians supporting three different faculties. The paper discusses the challenges faced in engaging students who may perceive information retrieval as a boring and difficult subject. The paper emphasizes the importance of motivation, inclusivity, constructive feedback, and collaborative learning in enhancing student engagement. By employing a two-librarian teaching model, maintaining a lighthearted approach, and relating information retrieval to everyday experiences, the team aimed to create an enjoyable and meaningful learning experience. The authors describe their approach to increase student engagement and learning outcomes through a three-phase workshop structure: before, during, and after the workshops. The "flipped classroom" method was used, where students were provided with pre-workshop materials, including a short film on information search and encouraged to reflect on the topic using a digital collaboration tool. During the workshops, interactive elements such as quizzes, live demonstrations, and practical training were incorporated, along with opportunities for students to ask questions and provide feedback. The paper concludes by highlighting the benefits of the flipped classroom approach and the extended learning opportunities provided by the before and after workshop phases. The authors believe that their approach offers a sustainable alternative for enhancing information retrieval knowledge among students at Linnaeus University.

Keywords: digital workshop, flipped classroom, information retrieval, interactivity, LIS practitioner, student engagement

Procedia PDF Downloads 47
1491 Detecting Paraphrases in Arabic Text

Authors: Amal Alshahrani, Allan Ramsay

Abstract:

Paraphrasing is one of the important tasks in natural language processing; i.e. alternative ways to express the same concept by using different words or phrases. Paraphrases can be used in many natural language applications, such as Information Retrieval, Machine Translation, Question Answering, Text Summarization, or Information Extraction. To obtain pairs of sentences that are paraphrases we create a system that automatically extracts paraphrases from a corpus, which is built from different sources of news article since these are likely to contain paraphrases when they report the same event on the same day. There are existing simple standard approaches (e.g. TF-IDF vector space, cosine similarity) and alignment technique (e.g. Dynamic Time Warping (DTW)) for extracting paraphrase which have been applied to the English. However, the performance of these approaches could be affected when they are applied to another language, for instance Arabic language, due to the presence of phenomena which are not present in English, such as Free Word Order, Zero copula, and Pro-dropping. These phenomena will affect the performance of these algorithms. Thus, if we can analysis how the existing algorithms for English fail for Arabic then we can find a solution for Arabic. The results are promising.

Keywords: natural language processing, TF-IDF, cosine similarity, dynamic time warping (DTW)

Procedia PDF Downloads 361
1490 Improved Performance in Content-Based Image Retrieval Using Machine Learning Approach

Authors: B. Ramesh Naik, T. Venugopal

Abstract:

This paper presents a novel approach which improves the high-level semantics of images based on machine learning approach. The contemporary approaches for image retrieval and object recognition includes Fourier transforms, Wavelets, SIFT and HoG. Though these descriptors helpful in a wide range of applications, they exploit zero order statistics, and this lacks high descriptiveness of image features. These descriptors usually take benefit of primitive visual features such as shape, color, texture and spatial locations to describe images. These features do not adequate to describe high-level semantics of the images. This leads to a gap in semantic content caused to unacceptable performance in image retrieval system. A novel method has been proposed referred as discriminative learning which is derived from machine learning approach that efficiently discriminates image features. The analysis and results of proposed approach were validated thoroughly on WANG and Caltech-101 Databases. The results proved that this approach is very competitive in content-based image retrieval.

Keywords: CBIR, discriminative learning, region weight learning, scale invariant feature transforms

Procedia PDF Downloads 162
1489 Audio Information Retrieval in Mobile Environment with Fast Audio Classifier

Authors: Bruno T. Gomes, José A. Menezes, Giordano Cabral

Abstract:

With the popularity of smartphones, mobile apps emerge to meet the diverse needs, however the resources at the disposal are limited, either by the hardware, due to the low computing power, or the software, that does not have the same robustness of desktop environment. For example, in automatic audio classification (AC) tasks, musical information retrieval (MIR) subarea, is required a fast processing and a good success rate. However the mobile platform has limited computing power and the best AC tools are only available for desktop. To solve these problems the fast classifier suits, to mobile environments, the most widespread MIR technologies, seeking a balance in terms of speed and robustness. At the end we found that it is possible to enjoy the best of MIR for mobile environments. This paper presents the results obtained and the difficulties encountered.

Keywords: audio classification, audio extraction, environment mobile, musical information retrieval

Procedia PDF Downloads 526
1488 Text Similarity in Vector Space Models: A Comparative Study

Authors: Omid Shahmirzadi, Adam Lugowski, Kenneth Younge

Abstract:

Automatic measurement of semantic text similarity is an important task in natural language processing. In this paper, we evaluate the performance of different vector space models to perform this task. We address the real-world problem of modeling patent-to-patent similarity and compare TFIDF (and related extensions), topic models (e.g., latent semantic indexing), and neural models (e.g., paragraph vectors). Contrary to expectations, the added computational cost of text embedding methods is justified only when: 1) the target text is condensed; and 2) the similarity comparison is trivial. Otherwise, TFIDF performs surprisingly well in other cases: in particular for longer and more technical texts or for making finer-grained distinctions between nearest neighbors. Unexpectedly, extensions to the TFIDF method, such as adding noun phrases or calculating term weights incrementally, were not helpful in our context.

Keywords: big data, patent, text embedding, text similarity, vector space model

Procedia PDF Downloads 154
1487 Human Action Retrieval System Using Features Weight Updating Based Relevance Feedback Approach

Authors: Munaf Rashid

Abstract:

For content-based human action retrieval systems, search accuracy is often inferior because of the following two reasons 1) global information pertaining to videos is totally ignored, only low level motion descriptors are considered as a significant feature to match the similarity between query and database videos, and 2) the semantic gap between the high level user concept and low level visual features. Hence, in this paper, we propose a method that will address these two issues and in doing so, this paper contributes in two ways. Firstly, we introduce a method that uses both global and local information in one framework for an action retrieval task. Secondly, to minimize the semantic gap, a user concept is involved by incorporating features weight updating (FWU) Relevance Feedback (RF) approach. We use statistical characteristics to dynamically update weights of the feature descriptors so that after every RF iteration feature space is modified accordingly. For testing and validation purpose two human action recognition datasets have been utilized, namely Weizmann and UCF. Results show that even with a number of visual challenges the proposed approach performs well.

Keywords: relevance feedback (RF), action retrieval, semantic gap, feature descriptor, codebook

Procedia PDF Downloads 449
1486 Utilization of CD-ROM Database as a Storage and Retrieval System by Students of Nasarawa State University Keffi

Authors: Suleiman Musa

Abstract:

The utilization of CD-ROM as a storage and retrieval system by Nasarawa State University Keffi (NSUK) Library is crucial in preserving and dissemination of information to students and staff. This study investigated the utilization of CD-ROM Database storage and retrieval system by students of NUSK. Data was generated using structure questionnaire. One thousand and fifty two (1052) respondents were randomly selected among post-graduate and under-graduate students. Eight hundred and ten (810) questionnaires were returned, but only five hundred and ninety three (593) questionnaires were well completed and useful. The study found that post-graduate students use CD-ROM Databases more often than the under-graduate students in NSUK. The result of the study revealed that knowledge about CD-ROM Database 33.22% got it through library staff. 29.69% use CD-ROM once a month. Large number of users 45.70% purposely uses CD-ROM Databases for study and research. In fact, lack of users’ orientation amount to 58.35% of problems faced, while 31.20% lack of trained staff make it more difficult for utilization of CD-ROM Database. Major numbers of users 38.28% are neither satisfied nor dissatisfied, while a good number of them 27.99% are satisfied. Then 1.52% is highly dissatisfied but could not give reasons why. However, to ensure effective utilization of CD-ROM Database storage and retrieval system by students of NSUK, the following recommendations are made: effort should be made to encourage under-graduate in using CD-ROM Database. The institution should conduct orientation/induction course for students on CD-ROM Databases in the library. There is need for NSUK to produce in house databases on their CD-ROM for easy access by users.

Keywords: utilization, CD-ROM databases, storage, retrieval, students

Procedia PDF Downloads 426
1485 Structural Analysis of Kamaluddin Behzad's Works Based on Roland Barthes' Theory of Communication, 'Text and Image'

Authors: Mahsa Khani Oushani, Mohammad Kazem Hasanvand

Abstract:

Text and image have always been two important components in Iranian layout. The interactive connection between text and image has shaped the art of book design with multiple patterns. In this research, first the structure and visual elements in the research data were analyzed and then the position of the text element and the image element in relation to each other based on Roland Barthes theory on the three theories of text and image, were studied and analyzed and the results were compared, and interpreted. The purpose of this study is to investigate the pattern of text and image in the works of Kamaluddin Behzad based on three Roland Barthes communication theories, 1. Descriptive communication, 2. Reference communication, 3. Matched communication. The questions of this research are what is the relationship between text and image in Behzad's works? And how is it defined according to Roland Barthes theory? The method of this research has been done with a structuralist approach with a descriptive-analytical method in a library collection method. The information has been collected in the form of documents (library) and is a tool for collecting online databases. Findings show that the dominant element in Behzad's drawings is with the image and has created a reference relationship in the layout of the drawings, but in some cases it achieves a different relationship that despite the preference of the image on the page, the text is dispersed proportionally on the page and plays a more active role, played within the image. The text and the image support each other equally on the page; Roland Barthes equates this connection.

Keywords: text, image, Kamaluddin Behzad, Roland Barthes, communication theory

Procedia PDF Downloads 175
1484 Plagiarism Detection for Flowchart and Figures in Texts

Authors: Ahmadu Maidorawa, Idrissa Djibo, Muhammad Tella

Abstract:

This paper presents a method for detecting flow chart and figure plagiarism based on shape of image processing and multimedia retrieval. The method managed to retrieve flowcharts with ranked similarity according to different matching sets. Plagiarism detection is well known phenomenon in the academic arena. Copying other people is considered as serious offense that needs to be checked. There are many plagiarism detection systems such as turn-it-in that has been developed to provide these checks. Most, if not all, discard the figures and charts before checking for plagiarism. Discarding the figures and charts result in look holes that people can take advantage. That means people can plagiarize figures and charts easily without the current plagiarism systems detecting it. There are very few papers which talks about flowcharts plagiarism detection. Therefore, there is a need to develop a system that will detect plagiarism in figures and charts.

Keywords: flowchart, multimedia retrieval, figures similarity, image comparison, figure retrieval

Procedia PDF Downloads 444
1483 Semantic Indexing Improvement for Textual Documents: Contribution of Classification by Fuzzy Association Rules

Authors: Mohsen Maraoui

Abstract:

In the aim of natural language processing applications improvement, such as information retrieval, machine translation, lexical disambiguation, we focus on statistical approach to semantic indexing for multilingual text documents based on conceptual network formalism. We propose to use this formalism as an indexing language to represent the descriptive concepts and their weighting. These concepts represent the content of the document. Our contribution is based on two steps. In the first step, we propose the extraction of index terms using the multilingual lexical resource Euro WordNet (EWN). In the second step, we pass from the representation of index terms to the representation of index concepts through conceptual network formalism. This network is generated using the EWN resource and pass by a classification step based on association rules model (in attempt to discover the non-taxonomic relations or contextual relations between the concepts of a document). These relations are latent relations buried in the text and carried by the semantic context of the co-occurrence of concepts in the document. Our proposed indexing approach can be applied to text documents in various languages because it is based on a linguistic method adapted to the language through a multilingual thesaurus. Next, we apply the same statistical process regardless of the language in order to extract the significant concepts and their associated weights. We prove that the proposed indexing approach provides encouraging results.

Keywords: concept extraction, conceptual network formalism, fuzzy association rules, multilingual thesaurus, semantic indexing

Procedia PDF Downloads 125
1482 Processing Big Data: An Approach Using Feature Selection

Authors: Nikat Parveen, M. Ananthi

Abstract:

Big data is one of the emerging technology, which collects the data from various sensors and those data will be used in many fields. Data retrieval is one of the major issue where there is a need to extract the exact data as per the need. In this paper, large amount of data set is processed by using the feature selection. Feature selection helps to choose the data which are actually needed to process and execute the task. The key value is the one which helps to point out exact data available in the storage space. Here the available data is streamed and R-Center is proposed to achieve this task.

Keywords: big data, key value, feature selection, retrieval, performance

Procedia PDF Downloads 319
1481 Towards Learning Query Expansion

Authors: Ahlem Bouziri, Chiraz Latiri, Eric Gaussier

Abstract:

The steady growth in the size of textual document collections is a key progress-driver for modern information retrieval techniques whose effectiveness and efficiency are constantly challenged. Given a user query, the number of retrieved documents can be overwhelmingly large, hampering their efficient exploitation by the user. In addition, retaining only relevant documents in a query answer is of paramount importance for an effective meeting of the user needs. In this situation, the query expansion technique offers an interesting solution for obtaining a complete answer while preserving the quality of retained documents. This mainly relies on an accurate choice of the added terms to an initial query. Interestingly enough, query expansion takes advantage of large text volumes by extracting statistical information about index terms co-occurrences and using it to make user queries better fit the real information needs. In this respect, a promising track consists in the application of data mining methods to extract dependencies between terms, namely a generic basis of association rules between terms. The key feature of our approach is a better trade off between the size of the mining result and the conveyed knowledge. Thus, face to the huge number of derived association rules and in order to select the optimal combination of query terms from the generic basis, we propose to model the problem as a classification problem and solve it using a supervised learning algorithm such as SVM or k-means. For this purpose, we first generate a training set using a genetic algorithm based approach that explores the association rules space in order to find an optimal set of expansion terms, improving the MAP of the search results. The experiments were performed on SDA 95 collection, a data collection for information retrieval. It was found that the results were better in both terms of MAP and NDCG. The main observation is that the hybridization of text mining techniques and query expansion in an intelligent way allows us to incorporate the good features of all of them. As this is a preliminary attempt in this direction, there is a large scope for enhancing the proposed method.

Keywords: supervised leaning, classification, query expansion, association rules

Procedia PDF Downloads 308
1480 Intertextuality in Choreography: Investigation of Text and Movements in Making Choreography

Authors: Muhammad Fairul Azreen Mohd Zahid

Abstract:

Speech, text, and movement intensify aspects of creating choreography by connecting with emotional entanglements, tradition, literature, and other texts. This research focuses on the practice as research that will prioritise the choreography process as an inquiry approach. With the driven context, the study intervenes in critical conjunctions of choreographic theory, bringing together new reflections on the moving body, spaces of action, as well as intertextuality between text and movements in making choreography. Throughout the process, the researcher will introduce the level of deliberation from speech through movements and text to express emotion within a narrative context of an “illocutionary act.” This practice as research will produce a different meaning from the “utterance text” to “utterance movements” in the perspective of speech acts theory by J.L Austin based on fragmented text from “pidato adat” which has been used as opening speech in Randai. Looking at the theory of deconstruction by Jacque Derrida also will give a different meaning from the text. Nevertheless, the process of creating the choreography will also help to lay the basic normative structure implicit in “constative” (statement text/movement) and “performative” (command text/movement). Through this process, the researcher will also look at several methods of using text from two works by Joseph Gonzales, “Becoming King-The Pakyung Revisited” and Crystal Pite's “The Statement,” as references to produce different methods in making choreography. The perspective from the semiotic foundation will support how occurrences within dance discourses as texts through a semiotic lens. The method used in this research is qualitative, which includes an interview and simulation of the concept to get an outcome.

Keywords: intertextuality, choreography, speech act, performative, deconstruction

Procedia PDF Downloads 79
1479 Written Argumentative Texts in Elementary School: The Development of Text Structure and Its Relation to Reading Comprehension

Authors: Sara Zadunaisky Ehrlich, Batia Seroussi, Anat Stavans

Abstract:

Text structure is a parameter of text quality. This study investigated the structure of written argumentative texts produced by elementary school age children. We set two objectives: to identify and trace the structural components of the argumentative texts and to investigate whether reading comprehension skills were correlated with text structure. 293 school children from 2nd to 5th grades were asked to write two argumentative texts about informal or everyday life controversial topics and completed two reading tasks that targeted different levels of text comprehension. The findings indicated, on the one hand, significant developmental differences between mature and more novice writers in terms of text length and mean proportion of clauses produced for a better elaboration of the different text components. On the other hand, with certain fluctuations, no meaningful differences were found in terms of presence of text structure: at all grade levels, elementary school children produced the basic and minimal structure that included the writer's argument and reasons or arguments' supports. Counter-arguments were scarce even in the upper grades. While the children captured that essentially an argument must be justified, the more the number of supports produced, the fewer the clauses the children produced. Last, weak to mild relations were found between reading comprehension and argumentative text structure. Nevertheless, children who scored higher on sophisticated questions that require inferential or world knowledge displayed more elaborated structures in terms of text length and size of supports to the writer's argument. These findings indicate how school-age children perceive the basic template of an argument with future implications regarding how to elaborate written arguments.

Keywords: argumentative text, text structure, elementary school children, written argumentations

Procedia PDF Downloads 150
1478 The Morphology of Sri Lankan Text Messages

Authors: Chamindi Dilkushi Senaratne

Abstract:

Communicating via a text or an SMS (Short Message Service) has become an integral part of our daily lives. With the increase in the use of mobile phones, text messaging has become a genre by itself worth researching and studying. It is undoubtedly a major phenomenon revealing language change. This paper attempts to describe the morphological processes of text language of urban bilinguals in Sri Lanka. It will be a typological study based on 500 English text messages collected from urban bilinguals residing in Colombo. The messages are selected by categorizing the deviant forms of language use apparent in text messages. These stylistic deviations are a deliberate skilled performance by the users of the language possessing an in-depth knowledge of linguistic systems to create new words and thereby convey their linguistic identity and individual and group solidarity via the message. The findings of the study solidifies arguments that the manipulation of language in text messages is both creative and appropriate. In addition, code mixing theories will be used to identify how existing morphological processes are adapted by bilingual users in Sri Lanka when texting. The study will reveal processes such as omission, initialism, insertion and alternation in addition to other identified linguistic features in text language. The corpus reveals the most common morphological processes used by Sri Lankan urban bilinguals when sending texts.

Keywords: bilingual, deviations, morphology, texts

Procedia PDF Downloads 251
1477 “Octopub”: Geographical Sentiment Analysis Using Named Entity Recognition from Social Networks for Geo-Targeted Billboard Advertising

Authors: Oussama Hafferssas, Hiba Benyahia, Amina Madani, Nassima Zeriri

Abstract:

Although data nowadays has multiple forms; from text to images, and from audio to videos, yet text is still the most used one at a public level. At an academical and research level, and unlike other forms, text can be considered as the easiest form to process. Therefore, a brunch of Data Mining researches has been always under its shadow, called "Text Mining". Its concept is just like data mining’s, finding valuable patterns in data, from large collections and tremendous volumes of data, in this case: Text. Named entity recognition (NER) is one of Text Mining’s disciplines, it aims to extract and classify references such as proper names, locations, expressions of time and dates, organizations and more in a given text. Our approach "Octopub" does not aim to find new ways to improve named entity recognition process, rather than that it’s about finding a new, and yet smart way, to use NER in a way that we can extract sentiments of millions of people using Social Networks as a limitless information source, and Marketing for product promotion as the main domain of application.

Keywords: textmining, named entity recognition(NER), sentiment analysis, social media networks (SN, SMN), business intelligence(BI), marketing

Procedia PDF Downloads 565
1476 Video Text Information Detection and Localization in Lecture Videos Using Moments

Authors: Belkacem Soundes, Guezouli Larbi

Abstract:

This paper presents a robust and accurate method for text detection and localization over lecture videos. Frame regions are classified into text or background based on visual feature analysis. However, lecture video shows significant degradation mainly related to acquisition conditions, camera motion and environmental changes resulting in low quality videos. Hence, affecting feature extraction and description efficiency. Moreover, traditional text detection methods cannot be directly applied to lecture videos. Therefore, robust feature extraction methods dedicated to this specific video genre are required for robust and accurate text detection and extraction. Method consists of a three-step process: Slide region detection and segmentation; Feature extraction and non-text filtering. For robust and effective features extraction moment functions are used. Two distinct types of moments are used: orthogonal and non-orthogonal. For orthogonal Zernike Moments, both Pseudo Zernike moments are used, whereas for non-orthogonal ones Hu moments are used. Expressivity and description efficiency are given and discussed. Proposed approach shows that in general, orthogonal moments show high accuracy in comparison to the non-orthogonal one. Pseudo Zernike moments are more effective than Zernike with better computation time.

Keywords: text detection, text localization, lecture videos, pseudo zernike moments

Procedia PDF Downloads 133
1475 Intonation Salience as an Underframe to Text Intonation Models

Authors: Tatiana Stanchuliak

Abstract:

It is common knowledge that intonation is not laid over a ready text. On the contrary, intonation forms and accompanies the text on the level of its birth in the speaker’s mind. As a result, intonation plays one of the fundamental roles in the process of transferring a thought into external speech. Intonation structure can highlight the semantic significance of textual elements and become a ranging mark in understanding the information structure of the text. Intonation functions by means of prosodic characteristics, one of which is intonation salience, whose function in texts results in making some textual elements more prominent than others. This function of intonation, therefore, performs as organizing. It helps to form the frame of key elements of the text. The study under consideration made an attempt to look into the inner nature of salience and create a sort of a text intonation model. This general goal brought to some more specific intermediate results. First, there were established degrees of salience on the level of the smallest semantic element - intonation group, as well as prosodic means of creating salience, were examined. Second, the most frequent combinations of prosodic means made it possible to distinguish patterns of salience, which then became constituent elements of a text intonation model. Third, the analysis of the predicate structure allowed to divide the whole text into smaller parts, or units, which performed a specific function in the developing of the general communicative intention. It appeared that such units can be found in any text and they have common characteristics of their intonation arrangement. These findings are certainly very important both for the theory of intonation and their practical application.

Keywords: accentuation , inner speech, intention, intonation, intonation functions, models, patterns, predicate, salience, semantics, sentence stress, text

Procedia PDF Downloads 246
1474 Between AACR2 and RDA What Changes Occurs in Them

Authors: Ibrahim Abdullahi Mohammad

Abstract:

A library catalogue exists not only as an inventory of the collections of the particular library, but also as a retrieval device. It is provided to assist the library user in finding whatever information or information resources they may be looking for. The paper proposes that this location objective of the library catalogue can only be fulfilled, if the library catalogue is constructed, bearing in mind the information needs and searching behavior of the library user. Comparing AACR2 and RDA viz-a-viz the changes RDA has introduced into bibliographic standards, the paper tries to establish the level of viability of RDA in relation to AACR2.

Keywords: library catalogue, information retrieval, AACR2, RDA

Procedia PDF Downloads 32
1473 Distorted Document Images Dataset for Text Detection and Recognition

Authors: Ilia Zharikov, Philipp Nikitin, Ilia Vasiliev, Vladimir Dokholyan

Abstract:

With the increasing popularity of document analysis and recognition systems, text detection (TD) and optical character recognition (OCR) in document images become challenging tasks. However, according to our best knowledge, no publicly available datasets for these particular problems exist. In this paper, we introduce a Distorted Document Images dataset (DDI-100) and provide a detailed analysis of the DDI-100 in its current state. To create the dataset we collected 7000 unique document pages, and extend it by applying different types of distortions and geometric transformations. In total, DDI-100 contains more than 100,000 document images together with binary text masks, text and character locations in terms of bounding boxes. We also present an analysis of several state-of-the-art TD and OCR approaches on the presented dataset. Lastly, we demonstrate the usefulness of DDI-100 to improve accuracy and stability of the considered TD and OCR models.

Keywords: document analysis, open dataset, optical character recognition, text detection

Procedia PDF Downloads 150
1472 Text-to-Speech in Azerbaijani Language via Transfer Learning in a Low Resource Environment

Authors: Dzhavidan Zeinalov, Bugra Sen, Firangiz Aslanova

Abstract:

Most text-to-speech models cannot operate well in low-resource languages and require a great amount of high-quality training data to be considered good enough. Yet, with the improvements made in ASR systems, it is now much easier than ever to collect data for the design of custom text-to-speech models. In this work, our work on using the ASR model to collect data to build a viable text-to-speech system for one of the leading financial institutions of Azerbaijan will be outlined. NVIDIA’s implementation of the Tacotron 2 model was utilized along with the HiFiGAN vocoder. As for the training, the model was first trained with high-quality audio data collected from the Internet, then fine-tuned on the bank’s single speaker call center data. The results were then evaluated by 50 different listeners and got a mean opinion score of 4.17, displaying that our method is indeed viable. With this, we have successfully designed the first text-to-speech model in Azerbaijani and publicly shared 12 hours of audiobook data for everyone to use.

Keywords: Azerbaijani language, HiFiGAN, Tacotron 2, text-to-speech, transfer learning, whisper

Procedia PDF Downloads 21
1471 Experimental Study of Hyperparameter Tuning a Deep Learning Convolutional Recurrent Network for Text Classification

Authors: Bharatendra Rai

Abstract:

The sequence of words in text data has long-term dependencies and is known to suffer from vanishing gradient problems when developing deep learning models. Although recurrent networks such as long short-term memory networks help to overcome this problem, achieving high text classification performance is a challenging problem. Convolutional recurrent networks that combine the advantages of long short-term memory networks and convolutional neural networks can be useful for text classification performance improvements. However, arriving at suitable hyperparameter values for convolutional recurrent networks is still a challenging task where fitting a model requires significant computing resources. This paper illustrates the advantages of using convolutional recurrent networks for text classification with the help of statistically planned computer experiments for hyperparameter tuning.

Keywords: long short-term memory networks, convolutional recurrent networks, text classification, hyperparameter tuning, Tukey honest significant differences

Procedia PDF Downloads 102
1470 Off-Topic Text Detection System Using a Hybrid Model

Authors: Usama Shahid

Abstract:

Be it written documents, news columns, or students' essays, verifying the content can be a time-consuming task. Apart from the spelling and grammar mistakes, the proofreader is also supposed to verify whether the content included in the essay or document is relevant or not. The irrelevant content in any document or essay is referred to as off-topic text and in this paper, we will address the problem of off-topic text detection from a document using machine learning techniques. Our study aims to identify the off-topic content from a document using Echo state network model and we will also compare data with other models. The previous study uses Convolutional Neural Networks and TFIDF to detect off-topic text. We will rearrange the existing datasets and take new classifiers along with new word embeddings and implement them on existing and new datasets in order to compare the results with the previously existing CNN model.

Keywords: off topic, text detection, eco state network, machine learning

Procedia PDF Downloads 65
1469 Design and Implementation of Flexible Metadata Editing System for Digital Contents

Authors: K. W. Nam, B. J. Kim, S. J. Lee

Abstract:

Along with the development of network infrastructures, such as high-speed Internet and mobile environment, the explosion of multimedia data is expanding the range of multimedia services beyond voice and data services. Amid this flow, research is actively being done on the creation, management, and transmission of metadata on digital content to provide different services to users. This paper proposes a system for the insertion, storage, and retrieval of metadata about digital content. The metadata server with Binary XML was implemented for efficient storage space and retrieval speeds, and the transport data size required for metadata retrieval was simplified. With the proposed system, the metadata could be inserted into the moving objects in the video, and the unnecessary overlap could be minimized by improving the storage structure of the metadata. The proposed system can assemble metadata into one relevant topic, even if it is expressed in different media or in different forms. It is expected that the proposed system will handle complex network types of data.

Keywords: video, multimedia, metadata, editing tool, XML

Procedia PDF Downloads 153
1468 Towards Logical Inference for the Arabic Question-Answering

Authors: Wided Bakari, Patrice Bellot, Omar Trigui, Mahmoud Neji

Abstract:

This article constitutes an opening to think of the modeling and analysis of Arabic texts in the context of a question-answer system. It is a question of exceeding the traditional approaches focused on morphosyntactic approaches. Furthermore, we present a new approach that analyze a text in order to extract correct answers then transform it to logical predicates. In addition, we would like to represent different levels of information within a text to answer a question and choose an answer among several proposed. To do so, we transform both the question and the text into logical forms. Then, we try to recognize all entailment between them. The results of recognizing the entailment are a set of text sentences that can implicate the user’s question. Our work is now concentrated on an implementation step in order to develop a system of question-answering in Arabic using techniques to recognize textual implications. In this context, the extraction of text features (keywords, named entities, and relationships that link them) is actually considered the first step in our process of text modeling. The second one is the use of techniques of textual implication that relies on the notion of inference and logic representation to extract candidate answers. The last step is the extraction and selection of the desired answer.

Keywords: NLP, Arabic language, question-answering, recognition text entailment, logic forms

Procedia PDF Downloads 319
1467 Critical Review of Web Content Mining Extraction Mechanisms

Authors: Rabia Bashir, Sajjad Akbar

Abstract:

There is an inevitable demand of web mining due to rapid increase of huge information on the Internet, but the striking variety of web structures has made required content retrieval a difficult task. To counter this issue, Web Content Mining (WCM) emerges as a potential candidate which extracts and integrates suitable resources of data to users. In past few years, research has been done on several extraction techniques for WCM i.e. agent-based, template-based, assumption-based, statistic-based, wrapper-based and machine learning. However, it is still unclear that either these approaches are efficiently tackling the significant challenges of WCM or not. To answer this question, this paper identifies these challenges such as language independency, structure flexibility, performance, automation, dynamicity, redundancy handling, intelligence, relevant content retrieval, and privacy. Further, mapping of these challenges is done with existing extraction mechanisms which helps to adopt the most suitable WCM approach, given some conditions and characteristics at hand.

Keywords: content mining challenges, web content mining, web content extraction approaches, web information retrieval

Procedia PDF Downloads 528
1466 The Composer’s Hand: An Analysis of Arvo Pärt’s String Orchestral Work, Psalom

Authors: Mark K. Johnson

Abstract:

Arvo Pärt has composed over 80 text-based compositions based on nine different languages. But prior to 2015, it was not publicly known what texts the composer used in composing a number of his non-vocal works, nor the language of those texts. Because of this lack of information, few if any musical scholars have illustrated in any detail how textual structure applies to any of Pärt’s instrumental compositions. However, in early 2015, the Arvo Pärt Centre in Estonia published In Principio, a compendium of the texts Pärt has used to derive many of the parameters of his text-based compositions. This paper provides the first detailed analysis of the relationship between structural aspects of the Church Slavonic Eastern Orthodox text of Psalm 112 and the musical parameters that Pärt used when composing the string orchestral work Psalom. It demonstrates that Pärt’s text-based compositions are carefully crafted works, and that evidence of the presence of the ‘invisible’ hand of the composer can be found within every aspect of the underpinning structures, at the more elaborate middle ground level, and even within surface aspects of these works. Based on the analysis of Psalom, it is evident that the text Pärt selected for Psalom informed many of his decisions regarding the musical structures, parameters and processes that he deployed in composing this non-vocal text-based work. Many of these composerly decisions in relation to these various aspects cannot be fathomed without access to, and an understanding of, the text associated with the work.

Keywords: Arvo Pärt, minimalism, psalom, text-based process music

Procedia PDF Downloads 212
1465 N400 Investigation of Semantic Priming Effect to Symbolic Pictures in Text

Authors: Thomas Ousterhout

Abstract:

The purpose of this study was to investigate if incorporating meaningful pictures of gestures and facial expressions in short sentences of text could supplement the text with enough semantic information to produce and N400 effect when probe words incongruent to the picture were subsequently presented. Event-related potentials (ERPs) were recorded from a 14-channel commercial grade EEG headset while subjects performed congruent/incongruent reaction time discrimination tasks. Since pictures of meaningful gestures have been shown to be semantically processed in the brain in a similar manner as words are, it is believed that pictures will add supplementary information to text just as the inclusion of their equivalent synonymous word would. The hypothesis is that when subjects read the text/picture mixed sentences, they will process the images and words just like in face-to-face communication and therefore probe words incongruent to the image will produce an N400.

Keywords: EEG, ERP, N400, semantics, congruency, facilitation, Emotiv

Procedia PDF Downloads 242
1464 Electroencephalogram during Natural Reading: Theta and Alpha Rhythms as Analytical Tools for Assessing a Reader’s Cognitive State

Authors: D. Zhigulskaya, V. Anisimov, A. Pikunov, K. Babanova, S. Zuev, A. Latyshkova, K. Сhernozatonskiy, A. Revazov

Abstract:

Electrophysiology of information processing in reading is certainly a popular research topic. Natural reading, however, has been relatively poorly studied, despite having broad potential applications for learning and education. In the current study, we explore the relationship between text categories and spontaneous electroencephalogram (EEG) while reading. Thirty healthy volunteers (mean age 26,68 ± 1,84) participated in this study. 15 Russian-language texts were used as stimuli. The first text was used for practice and was excluded from the final analysis. The remaining 14 were opposite pairs of texts in one of 7 categories, the most important of which were: interesting/boring, fiction/non-fiction, free reading/reading with an instruction, reading a text/reading a pseudo text (consisting of strings of letters that formed meaningless words). Participants had to read the texts sequentially on an Apple iPad Pro. EEG was recorded from 12 electrodes simultaneously with eye movement data via ARKit Technology by Apple. EEG spectral amplitude was analyzed in Fz for theta-band (4-8 Hz) and in C3, C4, P3, and P4 for alpha-band (8-14 Hz) using the Friedman test. We found that reading an interesting text was accompanied by an increase in theta spectral amplitude in Fz compared to reading a boring text (3,87 µV ± 0,12 and 3,67 µV ± 0,11, respectively). When instructions are given for reading, we see less alpha activity than during free reading of the same text (3,34 µV ± 0,20 and 3,73 µV ± 0,28, respectively, for C4 as the most representative channel). The non-fiction text elicited less activity in the alpha band (C4: 3,60 µV ± 0,25) than the fiction text (C4: 3,66 µV ± 0,26). A significant difference in alpha spectral amplitude was also observed between the regular text (C4: 3,64 µV ± 0,29) and the pseudo text (C4: 3,38 µV ± 0,22). These results suggest that some brain activity we see on EEG is sensitive to particular features of the text. We propose that changes in theta and alpha bands during reading may serve as electrophysiological tools for assessing the reader’s cognitive state as well as his or her attitude to the text and the perceived information. These physiological markers have prospective practical value for developing technological solutions and biofeedback systems for reading in particular and for education in general.

Keywords: EEG, natural reading, reader's cognitive state, theta-rhythm, alpha-rhythm

Procedia PDF Downloads 66