Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 72

Search results for: Bangla-UNL Dictionary

72 The Efficiency of the Use of Medical Bilingual Dictionary in English Language Teaching in Vocational College

Authors: Zorana Jurinjak, Christos Alexopoulos

Abstract:

The aim of this paper is to examine the effectiveness of using a medical bilingual dictionary in teaching English in a vocational college. More precisely, to what extent the use of bilingual medical dictionary in relation to the use of Standard English bilingual dictionaries influences the results on tests, and thus the acquisition of better competence of students mastering the subject terminology. Secondary interest in this paper would be to raise awareness among students and teachers about the advantages of dictionary use. The experiment was conducted at College of Applied Health Sciences in Ćuprija on a sample of 90 students. The respondents translated three medical texts with 42 target terms. Statistical analyses of the data obtained show that the differences in average time and correct answers favor the students who used medical dictionary.

Keywords: bilingual medical dictionary, standard english bilingual dictionary, medical terminology, EOS, ESP

Procedia PDF Downloads 40
71 The Grammatical Dictionary Compiler: A System for Kartvelian Languages

Authors: Liana Lortkipanidze, Nino Amirezashvili, Nino Javashvili

Abstract:

The purpose of the grammatical dictionary is to provide information on the morphological and syntactic characteristics of the basic word in the dictionary entry. The electronic grammatical dictionaries are used as a tool of automated morphological analysis for texts processing. The Georgian Grammatical Dictionary should contain grammatical information for each word: part of speech, type of declension/conjugation, grammatical forms of the word (paradigm), alternative variants of basic word/lemma. In this paper, we present the system for compiling the Georgian Grammatical Dictionary automatically. We propose dictionary-based methods for extending grammatical lexicons. The input lexicon contains only a few number of words with identical grammatical features. The extension is based on similarity measures between features of words; more precisely, we add words to the extended lexicons, which are similar to those, which are already in the grammatical dictionary. Our dictionaries are corpora-based, and for the compiling, we introduce the method for lemmatization of unknown words, i.e., words of which neither full form nor lemma is in the grammatical dictionary.

Keywords: acquisition of lexicon, Georgian grammatical dictionary, lemmatization rules, morphological processor

Procedia PDF Downloads 69
70 The Analysis of Indian Culture through the Lexicographical Discourse of Hindi-French Dictionary

Authors: Tanzil Ansari

Abstract:

A dictionary is often considered as a list of words, arranged in alphabetical orders, providing information on a language or languages and it informs us about the spelling, the pronunciation, the origin, the gender and the grammatical functions of new and unknown words. In other words, it is first and foremost a linguistic tool. But, the research across the world in the field of linguistic and lexicography proved that a dictionary is not only a linguistic tool but also a cultural product through which a lexicographer transmits the culture of a country or a linguistic community from his or her ideology. It means, a dictionary does not present only language and its metalinguistic functions but also its culture. Every language consists of some words and expressions which depict the culture of its language. In this way, it is impossible to disassociate language from its culture. There is always an ideology that plays an important role in the depiction of any culture. Using the orientalism theory of Edward Said to represent the east, the objective of the present research is to study the representation of Indian culture through the lexicographical discourse of Hindi-French Dictionary of Federica Boschetti, a French lexicographer. The results show that the Indian culture is stereotypical and monolithic. It also shows India as male oriented country where women are exploited by male-dominated society. The study is focused on Hindi-French dictionary, but its line of argument can be compared to dictionaries produced in other languages.

Keywords: culture, dictionary, lexicographical discourse, stereotype image

Procedia PDF Downloads 228
69 An Online Corpus-Based Bilingual Collocations Dictionary for Second/Foreign Language Learners

Authors: Adriane Orenha-Ottaiano

Abstract:

Collocations are conventionalized, recurrent and arbitrary lexical combinations. Due to the fact that they are highly specific for a particular language and may be contextually restricted, collocations pose a problem to EFL/ESL learners with regard to production or encoding. Taking that into account, the compilation of monolingual and bilingual collocations dictionaries for the referred audience is highly crucial and significant. Thus, the aim of this paper is to discuss the importance of the compilation of an Online Corpus-based Bilingual Collocations Dictionary, in the English-Portuguese and Portuguese-English directions. On a first phase, with the use of WordSmith Tools, the collocations were extracted from a Translation Learner Corpus (TLC), a parallel corpus made up of university students’ translations in the Portuguese-English direction, with approximately 100,000 words. In a second stage, based on the keywords analyzed from the TLC, more collocational patterns were extracted using the Sketch Engine. In order to include more collocations as well as to ensure dictionary users will have access to more frequent and recurrent collocations, we also use the frequency list from The Corpus of Contemporary American English, with the purpose of extracting more patterns. The dictionary focuses on all types of collocations (verbal, noun, adjectival and adverbial collocations), in order to help the referred audience use them more accurately and productively – so far the dictionary has more than 330 entries, and more than 3,500 collocations extracted. The idea of having the proposed dictionary in online format may allow to incorporate more qualitatively and quantitatively collocational information. Besides, more examples may be included, different from conventional printed collocations dictionaries. Being the first bilingual collocations dictionary in the aforementioned directions, it is hoped to achieve the challenge of meeting learners’ collocational needs as the collocations have been selected according to learners’ difficulties regarding the use of collocations.

Keywords: Corpus-Based Collocations Dictionary, Collocations , Bilingual Collocations Dictionary, Collocational Patterns

Procedia PDF Downloads 244
68 Music Note Detection and Dictionary Generation from Music Sheet Using Image Processing Techniques

Authors: Muhammad Ammar, Talha Ali, Abdul Basit, Bakhtawar Rajput, Zobia Sohail

Abstract:

Music note detection is an area of study for the past few years and has its own influence in music file generation from sheet music. We proposed a method to detect music notes on sheet music using basic thresholding and blob detection. Subsequently, we created a notes dictionary using a semi-supervised learning approach. After notes detection, for each test image, the new symbols are added to the dictionary. This makes the notes detection semi-automatic. The experiments are done on images from a dataset and also on the captured images. The developed approach showed almost 100% accuracy on the dataset images, whereas varying results have been seen on captured images.

Keywords: music note, sheet music, optical music recognition, blob detection, thresholding, dictionary generation

Procedia PDF Downloads 70
67 Effects of Computer-Mediated Dictionaries on Reading Comprehension and Vocabulary Acquisition

Authors: Mohamed Amin Mekheimer

Abstract:

This study aimed to investigate the effects of paper-based monolingual, pop-up and type-in electronic dictionaries on improving reading comprehension and incidental vocabulary acquisition and retention in an EFL context. It tapped into how computer-mediated dictionaries may have facilitated/impeded reading comprehension and vocabulary acquisition. Findings showed differential effects produced by the three treatments compared with the control group. Specifically, it revealed that the pop-up dictionary condition had the shortest average vocabulary searching time, vocabulary and text reading time, yet with less than the type-in dictionary group but more than the book dictionary group in terms of frequent dictionary 'look-ups' (p<.0001). In addition, ANOVA analyses also showed that text reading time differed significantly across all four treatments, and so did reading comprehension. Vocabulary acquisition was reported as enhanced in the three treatments rather than in the control group, but still with insignificant differences across the three treatments, yet with more differential effects in favour of the pop-up condition. Data also assert that participants preferred the pop-up e-dictionary more than the type-in and paper-based groups. Explanations of the findings vis-à-vis the cognitive load theory were presented. Pedagogical implications and suggestions for further research were forwarded at the end.

Keywords: computer-mediated dictionaries, type-in dictionaries, pop-up dictionaries, reading comprehension, vocabulary acquisition

Procedia PDF Downloads 368
66 Sparse Coding Based Classification of Electrocardiography Signals Using Data-Driven Complete Dictionary Learning

Authors: Fuad Noman, Sh-Hussain Salleh, Chee-Ming Ting, Hadri Hussain, Syed Rasul

Abstract:

In this paper, a data-driven dictionary approach is proposed for the automatic detection and classification of cardiovascular abnormalities. Electrocardiography (ECG) signal is represented by the trained complete dictionaries that contain prototypes or atoms to avoid the limitations of pre-defined dictionaries. The data-driven trained dictionaries simply take the ECG signal as input rather than extracting features to study the set of parameters that yield the most descriptive dictionary. The approach inherently learns the complicated morphological changes in ECG waveform, which is then used to improve the classification. The classification performance was evaluated with ECG data under two different preprocessing environments. In the first category, QT-database is baseline drift corrected with notch filter and it filters the 60 Hz power line noise. In the second category, the data are further filtered using fast moving average smoother. The experimental results on QT database confirm that our proposed algorithm shows a classification accuracy of 92%.

Keywords: electrocardiogram, dictionary learning, sparse coding, classification

Procedia PDF Downloads 295
65 Project Marayum: Creating a Community Built Mobile Phone Based, Online Web Dictionary for Endangered Philippine Languages

Authors: Samantha Jade Sadural, Kathleen Gay Figueroa, Noel Nicanor Sison II, Francis Miguel Quilab, Samuel Edric Solis, Kiel Gonzales, Alain Andrew Boquiren, Janelle Tan, Mario Carreon

Abstract:

Of the 185 languages in the Philippines, 28 are endangered, 11 are dying off, and 4 are extinct. Language documentation, as a prerequisite to language education, can be one of the ways languages can be preserved. Project Marayum is envisioned to be a collaboratively built, mobile phone-based, online dictionary platform for Philippine languages. Although there are many online language dictionaries available on the Internet, Project Marayum aims to give a sense of ownership to the language community's dictionary as it is built and maintained by the community for the community. From a seed dictionary, members of a language community can suggest changes, add new entries, and provide language examples. Going beyond word definitions, the platform can be used to gather sample sentences and even audio samples of word usage. These changes are reviewed by language experts of the community, sourced from the local state universities or local government units. Approved changes are then added to the dictionary and can be viewed instantly through the Marayum website. A companion mobile phone application allows users to browse the dictionary in remote areas where Internet connectivity is nonexistent. The dictionary will automatically be updated once the user regains Internet access. Project Marayum is still a work in progress. At the time of this abstract's writing, the Project has just entered its second year. Prototypes are currently being tested with the Asi language of Romblon island as its initial language testbed. In October 2020, Project Marayum will have both a webpage and mobile application with Asi, Ilocano, and Cebuano language dictionaries available for use online or for download. In addition, the Marayum platform would be then easily expandable for use of the more endangered language communities. Project Marayum is funded by the Philippines Department of Science and Technology.

Keywords: collaborative language dictionary, community-centered lexicography, content management system, software engineering

Procedia PDF Downloads 55
64 An Image Segmentation Algorithm for Gradient Target Based on Mean-Shift and Dictionary Learning

Authors: Yanwen Li, Shuguo Xie

Abstract:

In electromagnetic imaging, because of the diffraction limited system, the pixel values could change slowly near the edge of the image targets and they also change with the location in the same target. Using traditional digital image segmentation methods to segment electromagnetic gradient images could result in lots of errors because of this change in pixel values. To address this issue, this paper proposes a novel image segmentation and extraction algorithm based on Mean-Shift and dictionary learning. Firstly, the preliminary segmentation results from adaptive bandwidth Mean-Shift algorithm are expanded, merged and extracted. Then the overlap rate of the extracted image block is detected before determining a segmentation region with a single complete target. Last, the gradient edge of the extracted targets is recovered and reconstructed by using a dictionary-learning algorithm, while the final segmentation results are obtained which are very close to the gradient target in the original image. Both the experimental results and the simulated results show that the segmentation results are very accurate. The Dice coefficients are improved by 70% to 80% compared with the Mean-Shift only method.

Keywords: gradient image, segmentation and extract, mean-shift algorithm, dictionary iearning

Procedia PDF Downloads 193
63 Online Multilingual Dictionary Using Hamburg Notation for Avatar-Based Indian Sign Language Generation System

Authors: Sugandhi, Parteek Kumar, Sanmeet Kaur

Abstract:

Sign Language (SL) is used by deaf and other people who cannot speak but can hear or have a problem with spoken languages due to some disability. It is a visual gesture language that makes use of either one hand or both hands, arms, face, body to convey meanings and thoughts. SL automation system is an effective way which provides an interface to communicate with normal people using a computer. In this paper, an avatar based dictionary has been proposed for text to Indian Sign Language (ISL) generation system. This research work will also depict a literature review on SL corpus available for various SL s over the years. For ISL generation system, a written form of SL is required and there are certain techniques available for writing the SL. The system uses Hamburg sign language Notation System (HamNoSys) and Signing Gesture Mark-up Language (SiGML) for ISL generation. It is developed in PHP using Web Graphics Library (WebGL) technology for 3D avatar animation. A multilingual ISL dictionary is developed using HamNoSys for both English and Hindi Language. This dictionary will be used as a database to associate signs with words or phrases of a spoken language. It provides an interface for admin panel to manage the dictionary, i.e., modification, addition, or deletion of a word. Through this interface, HamNoSys can be developed and stored in a database and these notations can be converted into its corresponding SiGML file manually. The system takes natural language input sentence in English and Hindi language and generate 3D sign animation using an avatar. SL generation systems have potential applications in many domains such as healthcare sector, media, educational institutes, commercial sectors, transportation services etc. This research work will help the researchers to understand various techniques used for writing SL and generation of Sign Language systems.

Keywords: avatar, dictionary, HamNoSys, hearing impaired, Indian sign language (ISL), sign language

Procedia PDF Downloads 150
62 Sparse Representation Based Spatiotemporal Fusion Employing Additional Image Pairs to Improve Dictionary Training

Authors: Dacheng Li, Bo Huang, Qinjin Han, Ming Li

Abstract:

Remotely sensed imagery with the high spatial and temporal characteristics, which it is hard to acquire under the current land observation satellites, has been considered as a key factor for monitoring environmental changes over both global and local scales. On a basis of the limited high spatial-resolution observations, challenged studies called spatiotemporal fusion have been developed for generating high spatiotemporal images through employing other auxiliary low spatial-resolution data while with high-frequency observations. However, a majority of spatiotemporal fusion approaches yield to satisfactory assumption, empirical but unstable parameters, low accuracy or inefficient performance. Although the spatiotemporal fusion methodology via sparse representation theory has advantage in capturing reflectance changes, stability and execution efficiency (even more efficient when overcomplete dictionaries have been pre-trained), the retrieval of high-accuracy dictionary and its response to fusion results are still pending issues. In this paper, we employ additional image pairs (here each image-pair includes a Landsat Operational Land Imager and a Moderate Resolution Imaging Spectroradiometer acquisitions covering the partial area of Baotou, China) only into the coupled dictionary training process based on K-SVD (K-means Singular Value Decomposition) algorithm, and attempt to improve the fusion results of two existing sparse representation based fusion models (respectively utilizing one and two available image-pair). The results show that more eligible image pairs are probably related to a more accurate overcomplete dictionary, which generally indicates a better image representation, and is then contribute to an effective fusion performance in case that the added image-pair has similar seasonal aspects and image spatial structure features to the original image-pair. It is, therefore, reasonable to construct multi-dictionary training pattern for generating a series of high spatial resolution images based on limited acquisitions.

Keywords: spatiotemporal fusion, sparse representation, K-SVD algorithm, dictionary learning

Procedia PDF Downloads 193
61 Design and Implementation of Image Super-Resolution for Myocardial Image

Authors: M. V. Chidananda Murthy, M. Z. Kurian, H. S. Guruprasad

Abstract:

Super-resolution is the technique of intelligently upscaling images, avoiding artifacts or blurring, and deals with the recovery of a high-resolution image from one or more low-resolution images. Single-image super-resolution is a process of obtaining a high-resolution image from a set of low-resolution observations by signal processing. While super-resolution has been demonstrated to improve image quality in scaled down images in the image domain, its effects on the Fourier-based technique remains unknown. Super-resolution substantially improved the spatial resolution of the patient LGE images by sharpening the edges of the heart and the scar. This paper aims at investigating the effects of single image super-resolution on Fourier-based and image based methods of scale-up. In this paper, first, generate a training phase of the low-resolution image and high-resolution image to obtain dictionary. In the test phase, first, generate a patch and then difference of high-resolution image and interpolation image from the low-resolution image. Next simulation of the image is obtained by applying convolution method to the dictionary creation image and patch extracted the image. Finally, super-resolution image is obtained by combining the fused image and difference of high-resolution and interpolated image. Super-resolution reduces image errors and improves the image quality.

Keywords: image dictionary creation, image super-resolution, LGE images, patch extraction

Procedia PDF Downloads 247
60 The Automatisation of Dictionary-Based Annotation in a Parallel Corpus of Old English

Authors: Ana Elvira Ojanguren Lopez, Javier Martin Arista

Abstract:

The aims of this paper are to present the automatisation procedure adopted in the implementation of a parallel corpus of Old English, as well as, to assess the progress of automatisation with respect to tagging, annotation, and lemmatisation. The corpus consists of an aligned parallel text with word-for-word comparison Old English-English that provides the Old English segment with inflectional form tagging (gloss, lemma, category, and inflection) and lemma annotation (spelling, meaning, inflectional class, paradigm, word-formation and secondary sources). This parallel corpus is intended to fill a gap in the field of Old English, in which no parallel and/or lemmatised corpora are available, while the average amount of corpus annotation is low. With this background, this presentation has two main parts. The first part, which focuses on tagging and annotation, selects the layouts and fields of lexical databases that are relevant for these tasks. Most information used for the annotation of the corpus can be retrieved from the lexical and morphological database Nerthus and the database of secondary sources Freya. These are the sources of linguistic and metalinguistic information that will be used for the annotation of the lemmas of the corpus, including morphological and semantic aspects as well as the references to the secondary sources that deal with the lemmas in question. Although substantially adapted and re-interpreted, the lemmatised part of these databases draws on the standard dictionaries of Old English, including The Student's Dictionary of Anglo-Saxon, An Anglo-Saxon Dictionary, and A Concise Anglo-Saxon Dictionary. The second part of this paper deals with lemmatisation. It presents the lemmatiser Norna, which has been implemented on Filemaker software. It is based on a concordance and an index to the Dictionary of Old English Corpus, which comprises around three thousand texts and three million words. In its present state, the lemmatiser Norna can assign lemma to around 80% of textual forms on an automatic basis, by searching the index and the concordance for prefixes, stems and inflectional endings. The conclusions of this presentation insist on the limits of the automatisation of dictionary-based annotation in a parallel corpus. While the tagging and annotation are largely automatic even at the present stage, the automatisation of alignment is pending for future research. Lemmatisation and morphological tagging are expected to be fully automatic in the near future, once the database of secondary sources Freya and the lemmatiser Norna have been completed.

Keywords: corpus linguistics, historical linguistics, old English, parallel corpus

Procedia PDF Downloads 123
59 Sentiment Classification Using Enhanced Contextual Valence Shifters

Authors: Vo Ngoc Phu, Phan Thi Tuoi

Abstract:

We have explored different methods of improving the accuracy of sentiment classification. The sentiment orientation of a document can be positive (+), negative (-), or neutral (0). We combine five dictionaries from [2, 3, 4, 5, 6] into the new one with 21137 entries. The new dictionary has many verbs, adverbs, phrases and idioms, that are not in five ones before. The paper shows that our proposed method based on the combination of Term-Counting method and Enhanced Contextual Valence Shifters method has improved the accuracy of sentiment classification. The combined method has accuracy 68.984% on the testing dataset, and 69.224% on the training dataset. All of these methods are implemented to classify the reviews based on our new dictionary and the Internet Movie data set.

Keywords: sentiment classification, sentiment orientation, valence shifters, contextual, valence shifters, term counting

Procedia PDF Downloads 436
58 A Comparative Analysis of Vocabulary Learning Strategies among EFL Freshmen and Senior Medical Sciences Students across Different Fields of Study

Authors: M. Hadavi, Z. Hashemi

Abstract:

Learning strategies play an important role in the development of language skills. Vocabulary learning strategies as the backbone of these strategies have become a major part of English language teaching. This study is a comparative analysis of Vocabulary Learning Strategies (VLS) use and preference among freshmen and senior EFL medical sciences students with different fields of study. 449 students (236 freshman and 213 seniors) participated in the study. 64.6% were female and 35.4% were male. The instrument utilized in this research was a questionnaire consisting of 41 items related to the students’ approach to vocabulary learning. The items were classified under eight sections as dictionary strategies, guessing strategies, study preferences, memory strategies, autonomy, note- taking strategies, selective attention, and social strategies. The participants were asked to answer each item with a 5-point Likert-style frequency scale as follows:1) I never or almost never do this, 2) I don’t usually do this, 3) I sometimes do this, 4) I usually do this, and 5)I always or almost always do this. The results indicated that freshmen students and particularly surgical technology students used more strategies compared to the seniors. Overall guessing and dictionary strategies were the most frequently used strategies among all the learners (p=0/000). The mean and standard deviation of using VLS in the students who had no previous history of participating in the private English language classes was less than the students who had attended these type of classes (p=0/000). Female students tended to use social and study preference strategies whereas male students used mostly guessing and dictionary strategies. It can be concluded that the senior students under instruction from the university have learned to rely on themselves and choose the autonomous strategies more, while freshmen students use more strategies that are related to the study preferences.

Keywords: vocabulary leaning strategies, medical sciences, students, linguistics

Procedia PDF Downloads 372
57 Atwood's Canadianisms and Neologisms: A Cognitive Approach to Literature

Authors: Eleonora Sasso

Abstract:

This paper takes as its starting point the notions of cognitive linguistics and lexical blending, and uses both these theoretical concepts to advance a new reading of Margaret Atwood’s latest writings, one which sees them as paramount literary examples of norm and usage in bilingual Canadian lexicography. Atwood’s prose seems to be imbued with Canadianisms and neologisms, lexical blends of zoomorphic forms, a kind of meeting-point between two conceptual structures which follow the principles of lexical economy and asyntactic relation. Atwood’s neologisms also attest to the undeniable impact on language exerted by Canada’s aboriginal peoples. This paper aims to track through these references and with the aid of the Eskimo-English dictionary look at the linguistic issues – attitudes to contaminations and hybridisations, questions of lexical blending in literary examples, etc – which they raise. Atwood’s fiction, whose cognitive linguistic strategy employs ‘the virtues of scissors and matches’, always strives to achieve isomorphism between word form and concept.

Keywords: Atwood, Canadianisms, cognitive science, Eskimo/English dictionary

Procedia PDF Downloads 191
56 Experimental Evaluation of Succinct Ternary Tree

Authors: Dmitriy Kuptsov

Abstract:

Tree data structures, such as binary or in general k-ary trees, are essential in computer science. The applications of these data structures can range from data search and retrieval to sorting and ranking algorithms. Naive implementations of these data structures can consume prohibitively large volumes of random access memory limiting their applicability in certain solutions. Thus, in these cases, more advanced representation of these data structures is essential. In this paper we present the design of the compact version of ternary tree data structure and demonstrate the results for the experimental evaluation using static dictionary problem. We compare these results with the results for binary and regular ternary trees. The conducted evaluation study shows that our design, in the best case, consumes up to 12 times less memory (for the dictionary used in our experimental evaluation) than a regular ternary tree and in certain configuration shows performance comparable to regular ternary trees. We have evaluated the performance of the algorithms using both 32 and 64 bit operating systems.

Keywords: algorithms, data structures, succinct ternary tree, per- formance evaluation

Procedia PDF Downloads 71
55 A Motion Dictionary to Real-Time Recognition of Sign Language Alphabet Using Dynamic Time Warping and Artificial Neural Network

Authors: Marcio Leal, Marta Villamil

Abstract:

Computacional recognition of sign languages aims to allow a greater social and digital inclusion of deaf people through interpretation of their language by computer. This article presents a model of recognition of two of global parameters from sign languages; hand configurations and hand movements. Hand motion is captured through an infrared technology and its joints are built into a virtual three-dimensional space. A Multilayer Perceptron Neural Network (MLP) was used to classify hand configurations and Dynamic Time Warping (DWT) recognizes hand motion. Beyond of the method of sign recognition, we provide a dataset of hand configurations and motion capture built with help of fluent professionals in sign languages. Despite this technology can be used to translate any sign from any signs dictionary, Brazilian Sign Language (Libras) was used as case study. Finally, the model presented in this paper achieved a recognition rate of 80.4%.

Keywords: artificial neural network, computer vision, dynamic time warping, infrared, sign language recognition

Procedia PDF Downloads 147
54 KSVD-SVM Approach for Spontaneous Facial Expression Recognition

Authors: Dawood Al Chanti, Alice Caplier

Abstract:

Sparse representations of signals have received a great deal of attention in recent years. In this paper, the interest of using sparse representation as a mean for performing sparse discriminative analysis between spontaneous facial expressions is demonstrated. An automatic facial expressions recognition system is presented. It uses a KSVD-SVM approach which is made of three main stages: A pre-processing and feature extraction stage, which solves the problem of shared subspace distribution based on the random projection theory, to obtain low dimensional discriminative and reconstructive features; A dictionary learning and sparse coding stage, which uses the KSVD model to learn discriminative under or over dictionaries for sparse coding; Finally a classification stage, which uses a SVM classifier for facial expressions recognition. Our main concern is to be able to recognize non-basic affective states and non-acted expressions. Extensive experiments on the JAFFE static acted facial expressions database but also on the DynEmo dynamic spontaneous facial expressions database exhibit very good recognition rates.

Keywords: dictionary learning, random projection, pose and spontaneous facial expression, sparse representation

Procedia PDF Downloads 219
53 Bag of Local Features for Person Re-Identification on Large-Scale Datasets

Authors: Yixiu Liu, Yunzhou Zhang, Jianning Chi, Hao Chu, Rui Zheng, Libo Sun, Guanghao Chen, Fangtong Zhou

Abstract:

In the last few years, large-scale person re-identification has attracted a lot of attention from video surveillance since it has a potential application prospect in public safety management. However, it is still a challenging job considering the variation in human pose, the changing illumination conditions and the lack of paired samples. Although the accuracy has been significantly improved, the data dependence of the sample training is serious. To tackle this problem, a new strategy is proposed based on bag of visual words (BoVW) model of designing the feature representation which has been widely used in the field of image retrieval. The local features are extracted, and more discriminative feature representation is obtained by cross-view dictionary learning (CDL), then the assignment map is obtained through k-means clustering. Finally, the BoVW histograms are formed which encodes the images with the statistics of the feature classes in the assignment map. Experiments conducted on the CUHK03, Market1501 and MARS datasets show that the proposed method performs favorably against existing approaches.

Keywords: bag of visual words, cross-view dictionary learning, person re-identification, reranking

Procedia PDF Downloads 104
52 A Method for the Extraction of the Character's Tendency from Korean Novels

Authors: Min-Ha Hong, Kee-Won Kim, Seung-Hoon Kim

Abstract:

The character in the story-based content, such as novels and movies, is one of the core elements to understand the story. In particular, the character’s tendency is an important factor to analyze the story-based content, because it has a significant influence on the storyline. If readers have the knowledge of the tendency of characters before reading a novel, it will be helpful to understand the structure of conflict, episode and relationship between characters in the novel. It may therefore help readers to select novel that the reader wants to read. In this paper, we propose a method of extracting the tendency of the characters from a novel written in Korean. In advance, we build the dictionary with pairs of the emotional words in Korean and English since the emotion words in the novel’s sentences express character’s feelings. We rate the degree of polarity (positive or negative) of words in our emotional words dictionary based on SenticNet. Then we extract characters and emotion words from sentences in a novel. Since the polarity of a word grows strong or weak due to sentence features such as quotations and modifiers, our proposed method consider them to calculate the polarity of characters. The information of the extracted character’s polarity can be used in the book search service or book recommendation service.

Keywords: character tendency, data mining, emotion word, Korean novel

Procedia PDF Downloads 210
51 A Recognition Method for Spatio-Temporal Background in Korean Historical Novels

Authors: Seo-Hee Kim, Kee-Won Kim, Seung-Hoon Kim

Abstract:

The most important elements of a novel are the characters, events and background. The background represents the time, place and situation that character appears, and conveys event and atmosphere more realistically. If readers have the proper knowledge about background of novels, it may be helpful for understanding the atmosphere of a novel and choosing a novel that readers want to read. In this paper, we are targeting Korean historical novels because spatio-temporal background especially performs an important role in historical novels among the genre of Korean novels. To the best of our knowledge, we could not find previous study that was aimed at Korean novels. In this paper, we build a Korean historical national dictionary. Our dictionary has historical places and temple names of kings over many generations as well as currently existing spatial words or temporal words in Korean history. We also present a method for recognizing spatio-temporal background based on patterns of phrasal words in Korean sentences. Our rules utilize postposition for spatial background recognition and temple names for temporal background recognition. The knowledge of the recognized background can help readers to understand the flow of events and atmosphere, and can use to visualize the elements of novels.

Keywords: data mining, Korean historical novels, Korean linguistic feature, spatio-temporal background

Procedia PDF Downloads 208
50 Curriculum Check in Industrial Design, Based on Knowledge Management in Iran Universities

Authors: Maryam Mostafaee, Hassan Sadeghi Naeini, Sara Mostowfi

Abstract:

Today’s Knowledge management (KM), plays an important role in organizations. Basically, knowledge management is in the relation of using it for taking advantage of work forces in an organization for forwarding the goals and demand of that organization used at the most. The purpose of knowledge management is not only to manage existing documentation, information, and Data through an organization, but the most important part of KM is to control most important and key factor of those information and Data. For sure it is to chase the information needed for the employees in the right time of needed to take from genuine source for bringing out the best performance and result then in this matter the performance of organization will be at most of it. There are a lot of definitions over the objective of management released. Management is the science that in force the accurate knowledge with repeating to the organization to shape it and take full advantages for reaching goals and targets in the organization to be used by employees and users, but the definition of Knowledge based on Kalinz dictionary is: Facts, emotions or experiences known by man or group of people is ‘ knowledge ‘: Based on the Merriam Webster Dictionary: the act or skill of controlling and making decision about a business, department, sport team, etc, based on the Oxford Dictionary: Efficient handling of information and resources within a commercial organization, and based on the Oxford Dictionary: The art or process of designing manufactured products: the scale is a beautiful work of industrial design. When knowledge management performed executive in universities, discovery and create a new knowledge be facilitated. Make procedures between different units for knowledge exchange. College's officials and employees understand the importance of knowledge for University's success and will make more efforts to prevent the errors. In this strategy, is explored factors and affective trends and manage of it in University. In this research, Iranian universities for a time being analyzed that over usage of knowledge management, how they are behaving and having understood this matter: 1. Discovery of knowledge management in Iranian Universities, 2. Transferring exciting knowledge between faculties and unites, 3. Participate of employees for getting and using and transferring knowledge, 4.The accessibility of valid sources, 5. Researching over factors and correct processes in the university. We are pointing in some examples that we have already analyzed which is: -Enabling better and faster decision-making, -Making it easy to find relevant information and resources, -Reusing ideas, documents, and expertise, -Avoiding redundant effort. Consequence: It is found that effectiveness of knowledge management in the Industrial design field is low. Based on filled checklist by Education officials and professors in universities, and coefficient of effectiveness Calculate, knowledge management could not get the right place.

Keywords: knowledge management, industrial design, educational curriculum, learning performance

Procedia PDF Downloads 297
49 Technique and Use of Machine Readable Dictionary: In Special Reference to Hindi-Marathi Machine Translation

Authors: Milind Patil

Abstract:

Present paper is a discussion on Hindi-Marathi Morphological Analysis and generating rules for Machine Translation on the basis of Machine Readable Dictionary (MRD). This used Transformative Generative Grammar (TGG) rules to design the MRD. As per TGG rules, the suffix of a particular root word is based on its Tense, Aspect, Modality and Voice. That's why the suffix is very important for the word meanings (or root meanings). The Hindi and Marathi Language both have relation with Indo-Aryan language family. Both have been derived from Sanskrit language and their script is 'Devnagari'. But there are lots of differences in terms of semantics and grammatical level too. In Marathi, there are three genders, but in Hindi only two (Masculine and Feminine), the Natural gender is absent in Hindi. Likewise other grammatical categories also differ in their level of use. For MRD the suffixes (or Morpheme) are of particular root word for GNP (Gender, Number and Person) are based on its natural phenomena. A particular Suffix and Morphine change as per the need of person, number and gender. The design of MRD also based on this format. In first, Person, Number, Gender and Tense are key points than root words and suffix of particular Person, Number Gender (PNG). After that the inferences are drawn on the basis of rules that is (V.stem) (Pre.T/Past.T) (x) + (Aux-Pre.T) (x) → (V.Stem.) + (SP.TM) (X).

Keywords: MRD, TGG, stem, morph, morpheme, suffix, PNG, TAM&V, root

Procedia PDF Downloads 257
48 Speaker Identification by Atomic Decomposition of Learned Features Using Computational Auditory Scene Analysis Principals in Noisy Environments

Authors: Thomas Bryan, Veton Kepuska, Ivica Kostanic

Abstract:

Speaker recognition is performed in high Additive White Gaussian Noise (AWGN) environments using principals of Computational Auditory Scene Analysis (CASA). CASA methods often classify sounds from images in the time-frequency (T-F) plane using spectrograms or cochleargrams as the image. In this paper atomic decomposition implemented by matching pursuit performs a transform from time series speech signals to the T-F plane. The atomic decomposition creates a sparsely populated T-F vector in “weight space” where each populated T-F position contains an amplitude weight. The weight space vector along with the atomic dictionary represents a denoised, compressed version of the original signal. The arraignment or of the atomic indices in the T-F vector are used for classification. Unsupervised feature learning implemented by a sparse autoencoder learns a single dictionary of basis features from a collection of envelope samples from all speakers. The approach is demonstrated using pairs of speakers from the TIMIT data set. Pairs of speakers are selected randomly from a single district. Each speak has 10 sentences. Two are used for training and 8 for testing. Atomic index probabilities are created for each training sentence and also for each test sentence. Classification is performed by finding the lowest Euclidean distance between then probabilities from the training sentences and the test sentences. Training is done at a 30dB Signal-to-Noise Ratio (SNR). Testing is performed at SNR’s of 0 dB, 5 dB, 10 dB and 30dB. The algorithm has a baseline classification accuracy of ~93% averaged over 10 pairs of speakers from the TIMIT data set. The baseline accuracy is attributable to short sequences of training and test data as well as the overall simplicity of the classification algorithm. The accuracy is not affected by AWGN and produces ~93% accuracy at 0dB SNR.

Keywords: time-frequency plane, atomic decomposition, envelope sampling, Gabor atoms, matching pursuit, sparse dictionary learning, sparse autoencoder

Procedia PDF Downloads 223
47 The Difference of Learning Outcomes in Reading Comprehension between Text and Film as The Media in Indonesian Language for Foreign Speaker in Intermediate Level

Authors: Siti Ayu Ningsih

Abstract:

This study aims to find the differences outcomes in learning reading comprehension with text and film as media on Indonesian Language for foreign speaker (BIPA) learning at intermediate level. By using quantitative and qualitative research methods, the respondent of this study is a single respondent from D'Royal Morocco Integrative Islamic School in grade nine from secondary level. Quantitative method used to calculate the learning outcomes that have been given the appropriate action cycle, whereas qualitative method used to translate the findings derived from quantitative methods to be described. The technique used in this study is the observation techniques and testing work. Based on the research, it is known that the use of the text media is more effective than the film for intermediate level of Indonesian Language for foreign speaker learner. This is because, when using film the learner does not have enough time to take note the difficult vocabulary and don't have enough time to look for the meaning of the vocabulary from the dictionary. While the use of media texts shows the better effectiveness because it does not require additional time to take note the difficult words. For the words that are difficult or strange, the learner can immediately find its meaning from the dictionary. The presence of the text is also very helpful for Indonesian Language for foreign speaker learner to find the answers according to the questions more easily. By matching the vocabulary of the question into the text references.

Keywords: Indonesian language for foreign speaker, learning outcome, media, reading comprehension

Procedia PDF Downloads 139
46 Computerized Analysis of Phonological Structure of 10,400 Brazilian Sign Language Signs

Authors: Wanessa G. Oliveira, Fernando C. Capovilla

Abstract:

Capovilla and Raphael’s Libras Dictionary documents a corpus of 4,200 Brazilian Sign Language (Libras) signs. Duduchi and Capovilla’s software SignTracking permits users to retrieve signs even when ignoring the gloss corresponding to it and to discover the meaning of all 4,200 signs sign simply by clicking on graphic menus of the sign characteristics (phonemes). Duduchi and Capovilla have discovered that the ease with which any given sign can be retrieved is an inverse function of the average popularity of its component phonemes. Thus, signs composed of rare (distinct) phonemes are easier to retrieve than are those composed of common phonemes. SignTracking offers a means of computing the average popularity of the phonemes that make up each one of 4,200 signs. It provides a precise measure of the degree of ease with which signs can be retrieved, and sign meanings can be discovered. Duduchi and Capovilla’s logarithmic model proved valid: The degree with which any given sign can be retrieved is an inverse function of the arithmetic mean of the logarithm of the popularity of each component phoneme. Capovilla, Raphael and Mauricio’s New Libras Dictionary documents a corpus of 10,400 Libras signs. The present analysis revealed Libras DNA structure by mapping the incidence of 501 sign phonemes resulting from the layered distribution of five parameters: 163 handshape phonemes (CherEmes-ManusIculi); 34 finger shape phonemes (DactilEmes-DigitumIculi); 55 hand placement phonemes (ArtrotoToposEmes-ArticulatiLocusIculi); 173 movement dimension phonemes (CinesEmes-MotusIculi) pertaining to direction, frequency, and type; and 76 Facial Expression phonemes (MascarEmes-PersonalIculi).

Keywords: Brazilian sign language, lexical retrieval, libras sign, sign phonology

Procedia PDF Downloads 101
45 Methodological Proposal, Archival Thesaurus in Colombian Sign Language

Authors: Pedro A. Medina-Rios, Marly Yolie Quintana-Daza

Abstract:

Having the opportunity to communicate in a social, academic and work context is very relevant for any individual and more for a deaf person when oral language is not their natural language, and written language is their second language. Currently, in Colombia, there is not a specialized dictionary for our best knowledge in sign language archiving. Archival is one of the areas that the deaf community has a greater chance of performing. Nourishing new signs in dictionaries for deaf people extends the possibility that they have the appropriate signs to communicate and improve their performance. The aim of this work was to illustrate the importance of designing pedagogical and technological strategies of knowledge management, for the academic inclusion of deaf people through proposals of lexicon in Colombian sign language (LSC) in the area of archival. As a method, the analytical study was used to identify relevant words in the technical area of the archival and its counterpart with the LSC, 30 deaf people, apprentices - students of the Servicio Nacional de Aprendizaje (SENA) in Documentary or Archival Management programs, were evaluated through direct interviews in LSC. For the analysis tools were maintained to evaluate correlation patterns and linguistic methods of visual, gestural analysis and corpus; besides, methods of linear regression were used. Among the results, significant data were found among the variables socioeconomic stratum, academic level, labor location. The need to generate new signals on the subject of the file to improve communication between the deaf person, listener and the sign language interpreter. It is concluded that the generation of new signs to nourish the LSC dictionary in archival subjects is necessary to improve the labor inclusion of deaf people in Colombia.

Keywords: archival, inclusion, deaf, thesaurus

Procedia PDF Downloads 204
44 Lexical Based Method for Opinion Detection on Tripadvisor Collection

Authors: Faiza Belbachir, Thibault Schienhinski

Abstract:

The massive development of online social networks allows users to post and share their opinions on various topics. With this huge volume of opinion, it is interesting to extract and interpret these information for different domains, e.g., product and service benchmarking, politic, system of recommendation. This is why opinion detection is one of the most important research tasks. It consists on differentiating between opinion data and factual data. The difficulty of this task is to determine an approach which returns opinionated document. Generally, there are two approaches used for opinion detection i.e. Lexical based approaches and Machine Learning based approaches. In Lexical based approaches, a dictionary of sentimental words is used, words are associated with weights. The opinion score of document is derived by the occurrence of words from this dictionary. In Machine learning approaches, usually a classifier is trained using a set of annotated document containing sentiment, and features such as n-grams of words, part-of-speech tags, and logical forms. Majority of these works are based on documents text to determine opinion score but dont take into account if these texts are really correct. Thus, it is interesting to exploit other information to improve opinion detection. In our work, we will develop a new way to consider the opinion score. We introduce the notion of trust score. We determine opinionated documents but also if these opinions are really trustable information in relation with topics. For that we use lexical SentiWordNet to calculate opinion and trust scores, we compute different features about users like (numbers of their comments, numbers of their useful comments, Average useful review). After that, we combine opinion score and trust score to obtain a final score. We applied our method to detect trust opinions in TRIPADVISOR collection. Our experimental results report that the combination between opinion score and trust score improves opinion detection.

Keywords: Tripadvisor, opinion detection, SentiWordNet, trust score

Procedia PDF Downloads 118
43 Atomic Decomposition Audio Data Compression and Denoising Using Sparse Dictionary Feature Learning

Authors: T. Bryan , V. Kepuska, I. Kostnaic

Abstract:

A method of data compression and denoising is introduced that is based on atomic decomposition of audio data using “basis vectors” that are learned from the audio data itself. The basis vectors are shown to have higher data compression and better signal-to-noise enhancement than the Gabor and gammatone “seed atoms” that were used to generate them. The basis vectors are the input weights of a Sparse AutoEncoder (SAE) that is trained using “envelope samples” of windowed segments of the audio data. The envelope samples are extracted from the audio data by performing atomic decomposition with Gabor or gammatone seed atoms. This process identifies segments of audio data that are locally coherent with the seed atoms. Envelope samples are extracted by identifying locally coherent audio data segments with Gabor or gammatone seed atoms, found by matching pursuit. The envelope samples are formed by taking the kronecker products of the atomic envelopes with the locally coherent data segments. Oracle signal-to-noise ratio (SNR) verses data compression curves are generated for the seed atoms as well as the basis vectors learned from Gabor and gammatone seed atoms. SNR data compression curves are generated for speech signals as well as early American music recordings. The basis vectors are shown to have higher denoising capability for data compression rates ranging from 90% to 99.84% for speech as well as music. Envelope samples are displayed as images by folding the time series into column vectors. This display method is used to compare of the output of the SAE with the envelope samples that produced them. The basis vectors are also displayed as images. Sparsity is shown to play an important role in producing the highest denoising basis vectors.

Keywords: sparse dictionary learning, autoencoder, sparse autoencoder, basis vectors, atomic decomposition, envelope sampling, envelope samples, Gabor, gammatone, matching pursuit

Procedia PDF Downloads 186