Search results for: multilingual dictionary
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 58

58 A Dictionary Learning Method Based On EMD for Audio Sparse Representation

Authors: Yueming Wang, Zenghui Zhang, Rendong Ying, Peilin Liu

Abstract:

Sparse representation has long been studied and several dictionary learning methods have been proposed. The dictionary learning methods are widely used because they are adaptive. In this paper, a new dictionary learning method for audio is proposed. Signals are at first decomposed into different degrees of Intrinsic Mode Functions (IMF) using Empirical Mode Decomposition (EMD) technique. Then these IMFs form a learned dictionary. To reduce the size of the dictionary, the K-means method is applied to the dictionary to generate a K-EMD dictionary. Compared to K-SVD algorithm, the K-EMD dictionary decomposes audio signals into structured components, thus the sparsity of the representation is increased by 34.4% and the SNR of the recovered audio signals is increased by 20.9%.

Keywords: Dictionary Learning, EMD, K-means Method, Sparse Representation.

Downloads: 2583
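
The dictionary-reduction step mentioned in the abstract above (clustering many candidate atoms into a smaller set) can be illustrated with plain Lloyd's K-means. This is only a sketch under assumptions: the EMD step is not reproduced, the atoms are taken as rows of a NumPy array, and the function name `kmeans_reduce`, the seed, and the unit-norm rescaling are illustrative choices rather than details from the paper.

```python
import numpy as np

def kmeans_reduce(atoms, k, n_iter=50, seed=0):
    """Reduce a set of atoms (rows) to k centroids with Lloyd's K-means.

    Hypothetical helper: the paper's actual reduction procedure may differ.
    """
    atoms = np.asarray(atoms, dtype=float)
    rng = np.random.default_rng(seed)
    # Start from k distinct atoms chosen at random.
    centroids = atoms[rng.choice(len(atoms), size=k, replace=False)].copy()
    labels = np.zeros(len(atoms), dtype=int)
    for _ in range(n_iter):
        # Squared Euclidean distance from every atom to every centroid.
        dist = ((atoms[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        updated = centroids.copy()
        for j in range(k):
            members = atoms[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                updated[j] = members.mean(axis=0)
        if np.allclose(updated, centroids):
            break
        centroids = updated
    # Rescale centroids to unit norm so they can serve as dictionary atoms.
    norms = np.linalg.norm(centroids, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    return centroids / norms, labels

# Toy usage: four atoms that fall into two obvious groups.
toy_atoms = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.9, 0.1]]
reduced, assignment = kmeans_reduce(toy_atoms, k=2)
```

The reduced dictionary has `k` rows, so any later sparse-coding step searches over fewer atoms than the full set of decomposed components.
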
57 Online Multilingual Dictionary Using Hamburg Notation for Avatar-Based Indian Sign Language Generation System

Authors: Sugandhi, Parteek Kumar, Sanmeet Kaur

Abstract:

Sign Language (SL) is used by deaf and other people who cannot speak but can hear or have a problem with spoken languages due to some disability. It is a visual gesture language that makes use of either one hand or both hands, arms, face, body to convey meanings and thoughts. SL automation system is an effective way which provides an interface to communicate with normal people using a computer. In this paper, an avatar based dictionary has been proposed for text to Indian Sign Language (ISL) generation system. This research work will also depict a literature review on SL corpus available for various SLs over the years. For ISL generation system, a written form of SL is required and there are certain techniques available for writing the SL. The system uses Hamburg sign language Notation System (HamNoSys) and Signing Gesture Mark-up Language (SiGML) for ISL generation. It is developed in PHP using Web Graphics Library (WebGL) technology for 3D avatar animation. A multilingual ISL dictionary is developed using HamNoSys for both English and Hindi Language. This dictionary will be used as a database to associate signs with words or phrases of a spoken language. It provides an interface for admin panel to manage the dictionary, i.e., modification, addition, or deletion of a word. Through this interface, HamNoSys can be developed and stored in a database and these notations can be converted into its corresponding SiGML file manually. The system takes a natural language input sentence in English or Hindi and generates a 3D sign animation using an avatar. SL generation systems have potential applications in many domains such as healthcare sector, media, educational institutes, commercial sectors, transportation services etc. This research work will help the researchers to understand various techniques used for writing SL and generation of Sign Language systems.

Keywords: Avatar, dictionary, HamNoSys, hearing-impaired, Indian Sign Language, sign language.

Downloads: 1263
56 Particular Features of the First Romanian Multilingual Dictionaries

Authors: Mihaela Mocanu

Abstract:

The Romanian multilingual dictionaries – also named polyglot, plurilingual or polylingual dictionaries – have known a slow yet constant development starting with the end of the 17th century, when the first such work is attested, to the present time, when we witness a considerable increase of the number of polyglot dictionaries, especially the terminological ones. This paper aims at analyzing the context in which the first Romanian multilingual dictionaries were issued, as well as the organization and structure particularities of the first lexicographic works of this type. The irretrievable loss of some of these works as well as the partial conservation of others renders the attempt to retrace the beginnings of Romanian lexicography extremely difficult. The research methodology is part of a descriptive and analytical approach based on two types of sources, subject to contrastive analysis: the notes made by the initiators of lexicographic projects and the testimonies of their contemporaries, respectively, along with the specialized studies regarding the history of the old Romanian lexicography. The analysis of the contents has indicated that these dictionaries lacked a scientific apparatus in the true sense of the phrase, failed to obey unitary organizational criteria, being limited, most of the times, to mere inventories of words, where the Romanian term was assigned its correspondent in other languages. Motivated by practical reasons, the first multilingual dictionaries were aimed at the clerics, their purpose being to ensure the translators’ fidelity towards the original religious texts, regarded as sacred.

Keywords: Language, multilingual dictionary, Romanian lexicography, terminology.

Downloads: 1365
55 A Fast HRRP Synthesis Algorithm with Sensing Dictionary in GTD Model

Authors: R. Fan, Q. Wan, H. Chen, Y.L. Liu, Y.P. Liu

Abstract:

In this paper, a fast high-resolution range profile (HRRP) synthesis algorithm called orthogonal matching pursuit with sensing dictionary (OMP-SD) is proposed. It formulates traditional HRRP synthesis as a sparse approximation problem over a redundant dictionary. Because it exploits the prior knowledge that the synthetic range profile (SRP) of targets is sparse, the SRP can be obtained even in the presence of data loss. In addition, introducing the sensing dictionary (SD) reduces the computational complexity from O(MNDK) flops for OMP to O(M(N + D)K) flops for OMP-SD. Simulation experiments illustrate its advantages in both additive white Gaussian noise (AWGN) and noiseless situations.

Keywords: GTD-based model, HRRP, orthogonal matching pursuit, sensing dictionary.

Downloads: 1873
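
For readers unfamiliar with the baseline that the abstract above compares against, the following is a minimal sketch of plain orthogonal matching pursuit (OMP) over a real-valued dictionary. It does not implement the sensing-dictionary variant (OMP-SD) or the GTD signal model; the function name `omp`, the tolerance, and the toy dictionary are assumptions made for illustration.

```python
import numpy as np

def omp(D, y, n_nonzero, tol=1e-10):
    """Plain OMP: greedily pick atoms (columns of D) that best match the residual."""
    D = np.asarray(D, dtype=float)
    y = np.asarray(y, dtype=float)
    residual = y.copy()
    support = []                      # indices of the atoms selected so far
    coef = np.zeros(0)
    for _ in range(n_nonzero):
        # Correlate every atom with the current residual.
        corr = np.abs(D.T @ residual)
        if support:
            corr[support] = 0.0       # never select the same atom twice
        support.append(int(np.argmax(corr)))
        # Re-fit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
        if np.linalg.norm(residual) <= tol:
            break
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x

# Toy usage: with an identity dictionary the sparse code equals the signal.
x_hat = omp(np.eye(4), [0.0, 3.0, 0.0, -2.0], n_nonzero=2)
```

The correlation step `D.T @ residual` dominates the cost of each iteration, which is the step the abstract says the sensing dictionary is introduced to make cheaper.
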
54 From Mother Tongue Education to Multilingual Higher Education

Authors: Mario R. Acevedo Amaya, Fernanda M. Martinez Reyes

Abstract:

Over time, higher education has shifted its learning system from mother tongue instruction to bilingual instruction, and in this new century it has begun to develop multilingual education, all as part of the globalization of countries and of education. Nevertheless, this change has been effective only in first-world countries; the rest have lagged behind. These countries therefore need to strengthen their higher education systems through models that give way to multilingual and bilingual education. To this end, this paper presents a new model, adapted from a systemic approach, to enable bilingual and multilingual higher education in Latin America. This systematization aims to increase students’ skills and competencies, decrease the time needed to learn a second language, add to multilingualism in Latin American universities, help position the region’s countries in a better global standing, and stimulate the development of new research in this area.

Keywords: Bilingual Education, Higher Education, Multilingual Education, Multilingual Education Model

Downloads: 1881
53 Two Undetectable On-line Dictionary Attacks on Debiao et al.’s S-3PAKE Protocol

Authors: Sung-Bae Choi, Sang-Yoon Yoon, Eun-Jun Yoon

Abstract:

In 2011, Debiao et al. pointed out that the S-3PAKE protocol proposed by Lu and Cao for password-authenticated key exchange in the three-party setting is vulnerable to an off-line dictionary attack. They then proposed some countermeasures to eliminate the security vulnerability of S-3PAKE. Nevertheless, this paper points out that, contrary to their claim, their enhanced S-3PAKE protocol is still vulnerable to undetectable on-line dictionary attacks.

Keywords: Authentication, 3PAKE, password, three-party key exchange, network security, dictionary attacks.

Downloads: 1605
52 Learning an Overcomplete Dictionary using a Cauchy Mixture Model for Sparse Decay

Authors: E. S. Gower, M. O. J. Hawksford

Abstract:

An algorithm for learning an overcomplete dictionary using a Cauchy mixture model for sparse decomposition of an underdetermined mixing system is introduced. The mixture density function is derived from a ratio sample of the observed mixture signals where 1) there are at least two but not necessarily more mixture signals observed, 2) the source signals are statistically independent and 3) the sources are sparse. The basis vectors of the dictionary are learned via the optimization of the location parameters of the Cauchy mixture components, which is shown to be more accurate and robust than the conventional data mining methods usually employed for this task. Using a well known sparse decomposition algorithm, we extract three speech signals from two mixtures based on the estimated dictionary. Further tests with additive Gaussian noise are used to demonstrate the proposed algorithm's robustness to outliers.

Keywords: expectation-maximization, Pitman estimator, sparse decomposition

Downloads: 1907
51 Cryptanalysis of Yang-Li-Liao’s Simple Three-Party Key Exchange (S-3PAKE) Protocol

Authors: Hae-Soon Ahn, Eun-Jun Yoon

Abstract:

Three-party password authenticated key exchange (3PAKE) protocols are widely deployed in many remote user authentication systems because of their simplicity and the convenience of maintaining a human-memorable password at the client side to achieve secure communication within a hostile network. Recently, some researchers proposed an improvement of the 3PAKE protocol by processing a built-in data attached to other party for identity authentication to individual data. However, this paper points out that the improved 3PAKE protocol is still vulnerable to undetectable on-line dictionary attacks and off-line dictionary attacks.

Keywords: Three-party key exchange, 3PAKE, Password-authenticated key exchange, Network security, Dictionary attack

Downloads: 2083
50 Sparse Coding Based Classification of Electrocardiography Signals Using Data-Driven Complete Dictionary Learning

Authors: Fuad Noman, Sh-Hussain Salleh, Chee-Ming Ting, Hadri Hussain, Syed Rasul

Abstract:

In this paper, a data-driven dictionary approach is proposed for the automatic detection and classification of cardiovascular abnormalities. Electrocardiography (ECG) signal is represented by the trained complete dictionaries that contain prototypes or atoms to avoid the limitations of pre-defined dictionaries. The data-driven trained dictionaries simply take the ECG signal as input rather than extracting features to study the set of parameters that yield the most descriptive dictionary. The approach inherently learns the complicated morphological changes in ECG waveform, which is then used to improve the classification. The classification performance was evaluated with ECG data under two different preprocessing environments. In the first category, QT-database is baseline drift corrected with notch filter and it filters the 60 Hz power line noise. In the second category, the data are further filtered using fast moving average smoother. The experimental results on QT database confirm that our proposed algorithm shows a classification accuracy of 92%.

Keywords: Electrocardiogram, dictionary learning, sparse coding, classification.

Downloads: 2033
49 Sentiment Analysis: Comparative Analysis of Multilingual Sentiment and Opinion Classification Techniques

Authors: Sannikumar Patel, Brian Nolan, Markus Hofmann, Philip Owende, Kunjan Patel

Abstract:

Sentiment analysis and opinion mining have become emerging topics of research in recent years but most of the work is focused on data in the English language. A comprehensive research and analysis are essential which considers multiple languages, machine translation techniques, and different classifiers. This paper presents a comparative analysis of different approaches for multilingual sentiment analysis. These approaches are divided into two parts: one using classification of text without language translation and second using the translation of testing data to a target language, such as English, before classification. The presented research and results are useful for understanding whether machine translation should be used for multilingual sentiment analysis or building language specific sentiment classification systems is a better approach. The effects of language translation techniques, features, and accuracy of various classifiers for multilingual sentiment analysis are also discussed in this study.

Keywords: Cross-language analysis, machine learning, machine translation, sentiment analysis.

Downloads: 1606
48 An Image Segmentation Algorithm for Gradient Target Based on Mean-Shift and Dictionary Learning

Authors: Yanwen Li, Shuguo Xie

Abstract:

In electromagnetic imaging, because of the diffraction limited system, the pixel values could change slowly near the edge of the image targets and they also change with the location in the same target. Using traditional digital image segmentation methods to segment electromagnetic gradient images could result in lots of errors because of this change in pixel values. To address this issue, this paper proposes a novel image segmentation and extraction algorithm based on Mean-Shift and dictionary learning. Firstly, the preliminary segmentation results from adaptive bandwidth Mean-Shift algorithm are expanded, merged and extracted. Then the overlap rate of the extracted image block is detected before determining a segmentation region with a single complete target. Last, the gradient edge of the extracted targets is recovered and reconstructed by using a dictionary-learning algorithm, while the final segmentation results are obtained which are very close to the gradient target in the original image. Both the experimental results and the simulated results show that the segmentation results are very accurate. The Dice coefficients are improved by 70% to 80% compared with the Mean-Shift only method.

Keywords: Gradient image, segmentation and extract, mean-shift algorithm, dictionary learning.

Downloads: 930
47 Development of Circulating Support Environment of Multilingual Medical Communication using Parallel Texts for Foreign Patients

Authors: Mai Miyabe, Taku Fukushima, Takashi Yoshino, Aguri Shigeno

Abstract:

The need for multilingual communication in Japan has increased due to an increase in the number of foreigners in the country. When people communicate in their nonnative language, the differences in language prevent mutual understanding among the communicating individuals. In the medical field, communication between the hospital staff and patients is a serious problem. Currently, medical translators accompany patients to medical care facilities, and the demand for medical translators is increasing. However, medical translators cannot necessarily provide support, especially in cases in which round-the-clock support is required or in case of emergencies. The medical field has high expectations from information technology. Hence, a system that supports accurate multilingual communication is required. Despite recent advances in machine translation technology, it is very difficult to obtain highly accurate translations. We have developed a support system called M3 for multilingual medical reception. M3 provides support functions that aid foreign patients in the following respects: conversation, questionnaires, reception procedures, and hospital navigation; it also has a Q&A function. Users can operate M3 using a touch screen and receive text-based support. In addition, M3 uses accurate translation tools called parallel texts to facilitate reliable communication through conversations between the hospital staff and the patients. However, if there is no parallel text that expresses what users want to communicate, the users cannot communicate. In this study, we have developed a circulating support environment for multilingual medical communication using parallel texts. The proposed environment can circulate necessary parallel texts through the following procedure: (1) a user provides feedback about the necessary parallel texts, following which (2) these parallel texts are created and evaluated.

Keywords: multilingual medical communication, parallel texts.

Downloads: 1444
46 An Improvement of PDLZW implementation with a Modified WSC Updating Technique on FPGA

Authors: Perapong Vichitkraivin, Orachat Chitsobhuk

Abstract:

In this paper, an improvement of PDLZW implementation with a new dictionary updating technique is proposed. A unique dictionary is partitioned into hierarchical variable word-width dictionaries. This allows us to search through dictionaries in parallel. Moreover, the barrel shifter is adopted for loading a new input string into the shift register in order to achieve a faster speed. However, the original PDLZW uses a simple FIFO update strategy, which is not efficient. Therefore, a new window based updating technique is implemented to better classify the difference in how often each particular address in the window is referred. The freezing policy is applied to the address most often referred, which would not be updated until all the other addresses in the window have the same priority. This guarantees that the more often referred addresses would not be updated until their time comes. This updating policy leads to an improvement on the compression efficiency of the proposed algorithm while still keeping the architecture low in complexity and easy to implement.

Keywords: lossless data compression, LZW algorithm, PDLZW algorithm, WSC and dictionary update.

Downloads: 1585
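
As background for the entry above, this is a minimal software sketch of the basic LZW scheme that PDLZW builds on: a single dictionary that grows without bound. It does not model the partitioned variable word-width dictionaries, the FPGA datapath, or the window-based update policy described in the abstract; the function names are illustrative.

```python
def lzw_compress(data: bytes) -> list:
    """Basic LZW: emit the code of the longest phrase already in the dictionary."""
    table = {bytes([i]): i for i in range(256)}   # start with all single bytes
    phrase = b""
    codes = []
    for byte in data:
        candidate = phrase + bytes([byte])
        if candidate in table:
            phrase = candidate                    # keep extending the match
        else:
            codes.append(table[phrase])
            table[candidate] = len(table)         # learn the new phrase
            phrase = bytes([byte])
    if phrase:
        codes.append(table[phrase])
    return codes

def lzw_decompress(codes: list) -> bytes:
    """Inverse of lzw_compress; rebuilds the same dictionary on the fly."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            # The phrase being defined right now: previous + its first byte.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(out)

# Toy usage: repeated substrings are replaced by codes above 255.
sample = b"TOBEORNOTTOBEORTOBEORNOT"
packed = lzw_compress(sample)
restored = lzw_decompress(packed)
```

A hardware design with a fixed-size dictionary has to decide which entries to overwrite once the table is full, which is where an update policy such as the one the abstract proposes comes in.
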
45 A Collaborative Platform for Multilingual Ontology Development

Authors: Ahmed Tawfik, Fausto Giunchiglia, Vincenzo Maltese

Abstract:

Ontologies provide a common understanding of a specific domain of interest that can be communicated between people and used as background knowledge for automated reasoning in a wide range of applications. In this paper, we address the design of multilingual ontologies following well-defined knowledge engineering methodologies with the support of novel collaborative development approaches. In particular, we present a collaborative platform which allows ontologies to be developed incrementally in multiple languages. This is made possible via an appropriate mapping between language independent concepts and one lexicalization per language (or a lexical gap in case such lexicalization does not exist). The collaborative platform has been designed to support the development of the Universal Knowledge Core, a multilingual ontology currently in English, Italian, Chinese, Mongolian, Hindi and Bangladeshi. Its design follows a workflow-based development methodology that models resources as a set of collaborative objects and assigns customizable workflows to build and maintain each collaborative object in a community driven manner, with extensive support of modern web 2.0 social and collaborative features.

Keywords: Knowledge Diversity, Knowledge Representation, Ontology Development.

Downloads: 2173
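
The mapping described in the abstract above, from a language-independent concept to one lexicalization per language or an explicit lexical gap, can be pictured with a small data structure. This is a hypothetical sketch: the class name `Concept`, its fields, and the example language codes are assumptions, not the platform's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class Concept:
    """A language-independent concept with at most one word per language."""
    concept_id: str
    gloss: str
    # Language code -> word, or None to record an explicit lexical gap.
    words: Dict[str, Optional[str]] = field(default_factory=dict)

    def lexicalize(self, lang: str, word: str) -> None:
        """Attach (or replace) the single lexicalization for a language."""
        self.words[lang] = word

    def mark_gap(self, lang: str) -> None:
        """Record that the language has no word for this concept."""
        self.words[lang] = None

    def status(self, lang: str) -> str:
        """Distinguish 'not yet reviewed' from 'reviewed, no word exists'."""
        if lang not in self.words:
            return "missing"
        return "gap" if self.words[lang] is None else "lexicalized"

# Toy usage with made-up entries.
lake = Concept("c-001", "a body of water surrounded by land")
lake.lexicalize("en", "lake")
lake.lexicalize("it", "lago")
lake.mark_gap("xx")   # "xx" is a placeholder language code
```

Keeping gaps explicit lets contributors tell an untranslated concept apart from one that has been reviewed and found to have no equivalent word.
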
44 Iterative Image Reconstruction for Sparse-View Computed Tomography via Total Variation Regularization and Dictionary Learning

Authors: XianYu Zhao, JinXu Guo

Abstract:

Recently, low-dose computed tomography (CT) has become highly desirable due to increasing attention to the potential risks of excessive radiation. For low-dose CT imaging, ensuring image quality while reducing radiation dose is a major challenge. To facilitate low-dose CT imaging, we propose an improved statistical iterative reconstruction scheme based on the Penalized Weighted Least Squares (PWLS) standard combined with total variation (TV) minimization and sparse dictionary learning (DL) to improve reconstruction performance. We call this method "PWLS-TV-DL". In order to evaluate the PWLS-TV-DL method, we performed experiments on digital phantoms and physical phantoms, respectively. The experimental results show that our method is superior to other methods in image quality and computational efficiency, which confirms its potential for low-dose CT imaging.

Keywords: Low dose computed tomography, penalized weighted least squares, total variation, dictionary learning.

Downloads: 785
43 Lexical Database for Multiple Languages: Multilingual Word Semantic Network

Authors: K. K. Yong, R. Mahmud, C. S. Woo

Abstract:

Data mining and knowledge engineering have become a tough task due to the availability of large amounts of data on the web nowadays. Validity and reliability of data have also become a main debate in knowledge acquisition. Besides, acquiring knowledge from different languages has become another concern. There are many language translators and corpora developed, but the functions of these translators and corpora are usually limited to certain languages and domains. Furthermore, search results from engines with the traditional 'keyword' approach are no longer satisfying. More intelligent knowledge engineering agents are needed. To address these problems, a system known as Multilingual Word Semantic Network is proposed. This system adapts a semantic network to organize words according to concepts and relations. The system also uses open source as the development philosophy to enable native language speakers and experts to contribute their knowledge to the system. The contributed words are then defined and linked using lexical and semantic relations. Thus, related words and derivatives can be identified and linked. The outcome of the system implementation contributes to the development of the semantic web and knowledge engineering.

Keywords: Multilingual, semantic network, intelligent knowledge engineering.

Downloads: 1917
42 How Efficiency of Password Attack Based on a Keyboard

Authors: Hsien-cheng Chou, Fei-pei Lai, Hung-chang Lee

Abstract:

At present, dictionary attack has been the basic tool for recovering key passwords. In order to avoid dictionary attack, users purposely choose other character strings as passwords. According to statistics, about 14% of users choose keys on a keyboard (Kkey, for short) as passwords. This paper develops a framework system to attack passwords chosen from Kkeys and analyzes its efficiency. Within this system, we build up keyboard rules using the adjacent and parallel relationship among Kkeys and then use these Kkey rules to generate password databases by depth-first search method. According to the experiment results, we find that the key space of databases derived from these Kkey rules can be far smaller than the password databases generated within brute-force attack, thus effectively narrowing down the scope of attack research. Taking one general Kkey rule, the combinations in all printable characters (94 types) with Kkey adjacent and parallel relationship, as an example, the derived key space is about 240 smaller than those in brute-force attack. In addition, we demonstrate the method's practicality and value by successfully cracking the access password to UNIX and PC using the password databases created.

Keywords: Brute-force attack, dictionary attack, depth-first search, password attack.

Downloads: 3442
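
The candidate-generation idea in the abstract above, a depth-first search over keyboard rules, can be sketched as an enumeration of walks on a key-adjacency graph. The adjacency map below is a made-up three-key fragment, only the "adjacent" relation is modelled (not the "parallel" one), and `keyboard_walks` is a hypothetical name, so this illustrates the search pattern rather than the paper's rule set.

```python
def keyboard_walks(adjacency, length):
    """Yield every string of `length` keys in which consecutive keys are adjacent.

    `adjacency` maps each key to the keys considered next to it.
    `length` must be at least 1.
    """
    def dfs(path):
        if len(path) == length:
            yield "".join(path)
            return
        for neighbour in adjacency[path[-1]]:
            path.append(neighbour)      # go one key deeper
            yield from dfs(path)
            path.pop()                  # backtrack

    for start in adjacency:
        yield from dfs([start])

# Toy usage: a three-key fragment of a top row (assumed adjacency).
toy_adjacency = {"q": ["w"], "w": ["q", "e"], "e": ["w"]}
candidates = list(keyboard_walks(toy_adjacency, 3))
brute_force_size = len(toy_adjacency) ** 3
```

Even on this tiny fragment the adjacency constraint yields 6 candidates against 27 for exhaustive enumeration, which is the kind of key-space reduction the abstract reports at full scale.
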
41 A Multilingual Virtual Simulated Patient Framework for Training Primary Health Care Students

Authors: Juan L. Castro, Maria I. Navarro, Victor Lopez, Eduardo M. Eisman, Jose M. Zurita

Abstract:

This paper describes the Multilingual Virtual Simulated Patient framework. It has been created to train the social skills and test the knowledge of primary health care medical students. The framework generates conversational agents which perform in several languages as virtual simulated patients that help to improve the communication and diagnosis skills of the students, complementing their training process.

Keywords: Medical training, conversational agents, patient modeling.

Downloads: 1489
40 Inferring Hierarchical Pronunciation Rules from a Phonetic Dictionary

Authors: Erika Pigliapoco, Valerio Freschi, Alessandro Bogliolo

Abstract:

This work presents a new phonetic transcription system based on a tree of hierarchical pronunciation rules expressed as context-specific grapheme-phoneme correspondences. The tree is automatically inferred from a phonetic dictionary by incrementally analyzing deeper context levels, eventually representing a minimum set of exhaustive rules that pronounce without errors all the words in the training dictionary and that can be applied to out-of-vocabulary words. The proposed approach improves upon existing rule-tree-based techniques in that it makes use of graphemes, rather than letters, as elementary orthographic units. A new linear algorithm for the segmentation of a word in graphemes is introduced to enable out-of-vocabulary grapheme-based phonetic transcription. Exhaustive rule trees provide a canonical representation of the pronunciation rules of a language that can be used not only to pronounce out-of-vocabulary words, but also to analyze and compare the pronunciation rules inferred from different dictionaries. The proposed approach has been implemented in C and tested on Oxford British English and Basic English. Experimental results show that grapheme-based rule trees represent phonetically sound rules and provide better performance than letter-based rule trees.

Keywords: Automatic phonetic transcription, pronunciation rules, hierarchical tree inference.

Downloads: 1885
39 Test Data Compression Using a Hybrid of Bitmask Dictionary and 2n Pattern Runlength Coding Methods

Authors: C. Kalamani, K. Paramasivam

Abstract:

In VLSI, testing plays an important role. The major problems in testing are test data volume and test power. The important solution to reduce test data volume and test time is test data compression. The proposed technique combines the bitmask dictionary and 2n pattern run-length coding methods and provides a substantial improvement in the compression efficiency without introducing any additional decompression penalty. This method has been implemented using MATLAB and HDL to reduce test data volume and memory requirements. The method is applied to various benchmark test sets and the results are compared with other existing methods. The proposed technique can achieve a compression ratio of up to 86%.

Keywords: Bit Mask dictionary, 2n pattern run length code, system-on-chip, SOC, test data compression.

Downloads: 1877
38 Signed Approach for Mining Web Content Outliers

Authors: G. Poonkuzhali, K.Thiagarajan, K.Sarukesi, G.V.Uma

Abstract:

The emergence of the Internet has brewed the revolution of information storage and retrieval. As most of the data in the web is unstructured, and contains a mix of text, video, audio etc, there is a need to mine information to cater to the specific needs of the users without loss of important hidden information. Thus developing user friendly and automated tools for providing relevant information quickly becomes a major challenge in web mining research. Most of the existing web mining algorithms have concentrated on finding frequent patterns while neglecting the less frequent ones that are likely to contain outlying data such as noise, irrelevant and redundant data. This paper mainly focuses on Signed approach and full word matching on the organized domain dictionary for mining web content outliers. This Signed approach gives the relevant web documents as well as outlying web documents. As the dictionary is organized based on the number of characters in a word, searching and retrieval of documents takes less time and less space.

Keywords: Outliers, Relevant document, Signed Approach, Web content mining, Web documents.

Downloads: 2312
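
The abstract above notes that the domain dictionary is organized by the number of characters in a word to speed up full-word matching. The following is a minimal sketch of that indexing idea only; the scoring function, the names `build_length_index` and `match_ratio`, and the sample vocabulary are assumptions, and the paper's Signed approach itself is not reproduced.

```python
from collections import defaultdict

def build_length_index(words):
    """Group dictionary words by their character count."""
    index = defaultdict(set)
    for word in words:
        index[len(word)].add(word.lower())
    return index

def match_ratio(text, index):
    """Fraction of whitespace-separated tokens found in the dictionary.

    Each token is compared only against dictionary words of the same length.
    """
    tokens = [token.lower() for token in text.split()]
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in index.get(len(token), ()))
    return hits / len(tokens)

# Toy usage with a made-up domain vocabulary.
domain_index = build_length_index(["sparse", "coding", "dictionary"])
score = match_ratio("sparse coding works", domain_index)
```

A document whose ratio falls well below that of its peers would be a candidate outlier under this simplified view; the threshold would have to be chosen per collection.
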
37 A Case Study on Vocational Teachers’ Perceptions on Their Linguistically and Culturally Responsive Teaching

Authors: Kirsi Korkealehto

Abstract:

In Finland, the transformation from a homogeneous culture into a multicultural one as a result of heavy immigration has been rapid in recent decades. As multilingualism and multiculturalism are growing features in our society, teachers at all educational levels need to be competent for encounters with students from diverse cultural backgrounds. Consequently, the number of multicultural and multilingual vocational school students has also increased, which has not been taken into consideration enough in teacher education. To bridge this gap between teachers’ competences and the requirements of the contemporary school world, the Finnish Ministry of Culture and Education established the DivEd-project. The aim of the project is to prepare all teachers to work in the linguistically and culturally diverse world they live in, to develop and increase culturally sustaining and linguistically responsive pedagogy in Finland, to increase awareness among Teacher Educators working with preservice teachers, and to increase awareness and provide specific strategies to in-service teachers. The partners in the nationwide project are 6 universities and 2 universities of applied sciences. In this research, the linguistically and culturally sustainable teaching practices developed within the DivEd-project are tested in practice. This research aims to explore vocational teachers’ perceptions of these multilingualism and multilingual educational practices. The participants of this study are vocational teachers in different fields. The data were collected by individual, face-to-face interviews. The data analysis was conducted through content analysis. The findings indicate that the vocational teachers experience that they lack knowledge on linguistically and culturally responsive pedagogy. Moreover, they regard themselves as to some extent incompetent in incorporating multilingually and multiculturally sustainable pedagogy in everyday teaching work. Therefore, they feel they need more training pertaining to multicultural and multilingual knowledge, competences and suitable pedagogical methods for teaching students from diverse linguistic and cultural backgrounds.

Keywords: Multicultural, multilingual, teacher competences, vocational school.

Downloads: 430
36 Hospitality Management to Welcome Foreign Guests in the Japanese Lodging Industry

Authors: Shunichiro Morishita

Abstract:

This study examines the factors for attracting foreign guests in the Japanese lodging industry and discusses some measures taken for accepting foreign guests. It reviews three different accommodation providers acclaimed highly by foreign guests, Yamashiroya, Sawanoya and Fuji-Hakone Guest House, and identifies their characteristics. The common points for attracting foreign guests were: 1) making the best use of the old facilities, 2) multilingual signs, guidance and websites, 3) necessary and sufficient communication in English, 4) events and opportunities to experience Japanese culture, 5) omotenashi, warm and homely Japanese hospitality. These findings indicate that foreign guests’ dissatisfaction level can be decreased through internationalization utilizing ICT and by offering multilingual support. On the other hand, their satisfaction level can be increased by encouraging interaction with other guests and local Japanese people, providing events and opportunities to experience Japanese culture and omotenashi, home-style Japanese hospitality.

Keywords: Hospitality management, foreign guests, Japanese lodging industry, Omotenashi.

Downloads 888
35 A Motion Dictionary to Real-Time Recognition of Sign Language Alphabet Using Dynamic Time Warping and Artificial Neural Network

Authors: Marcio Leal, Marta Villamil

Abstract:

Computational recognition of sign languages aims to allow greater social and digital inclusion of deaf people through interpretation of their language by computer. This article presents a model for recognizing two of the global parameters of sign languages: hand configurations and hand movements. Hand motion is captured through infrared technology and the hand's joints are reconstructed in a virtual three-dimensional space. A Multilayer Perceptron Neural Network (MLP) is used to classify hand configurations and Dynamic Time Warping (DTW) recognizes hand motion. Beyond the method of sign recognition, we provide a dataset of hand configurations and motion capture built with the help of fluent professionals in sign languages. Although this technology can be used to translate any sign from any sign dictionary, Brazilian Sign Language (Libras) was used as a case study. Finally, the model presented in this paper achieved a recognition rate of 80.4%.
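As an illustration of the motion-recognition step, the following is a minimal sketch of classic Dynamic Time Warping used as a nearest-template classifier. It is not the authors' implementation: the function names `dtw_distance` and `classify_motion`, the use of Euclidean point distance, and the toy 2-D trajectories are all assumptions made for the example.

```python
import math

def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW cost between two point sequences."""
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean distance (assumed metric)
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]

def classify_motion(query, templates):
    """Return the label of the template with the lowest DTW cost to the query."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))

# Hypothetical templates: a straight stroke and an arched stroke.
templates = {
    "line": [(0, 0), (1, 0), (2, 0)],
    "arch": [(0, 0), (1, 1), (2, 0)],
}
# A slower, slightly noisy straight stroke is still matched to "line".
label = classify_motion([(0, 0), (0, 0), (1, 0.1), (2, 0)], templates)
```

Because DTW lets one sequence pause while the other advances, a gesture performed at a different speed can still align with its template, which is the usual motivation for using it on captured motion.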

Keywords: Sign language recognition, computer vision, infrared, artificial neural network, dynamic time warping.

Downloads 826
34 Automatic Building an Extensive Arabic FA Terms Dictionary

Authors: El-Sayed Atlam, Masao Fuketa, Kazuhiro Morita, Jun-ichi Aoe

Abstract:

Field Association (FA) terms are a limited set of discriminating terms that give us the knowledge to identify document fields, and they are effective in document classification, similar file retrieval and passage retrieval. The problem lies in the lack of an effective method to automatically extract relevant Arabic FA Terms to build a comprehensive dictionary. Moreover, all previous studies are based on FA terms in English and Japanese, and the extension of FA terms to other languages such as Arabic could strengthen further research. This paper presents a new method to extract Arabic FA Terms from domain-specific corpora using part-of-speech (POS) pattern rules and corpora comparison. Experimental evaluation is carried out for 14 different fields using 251 MB of domain-specific corpora obtained from Arabic Wikipedia dumps and Alhyah news, from which an average of 2,825 FA Terms (single and compound) per field is selected. From the experimental results, recall and precision are 84% and 79% respectively. Therefore, this method selects a high number of relevant Arabic FA Terms at high precision and recall.
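The general idea of combining POS pattern rules with corpora comparison can be sketched as follows. This is only an illustrative toy, not the method of the paper: the pattern set `PATTERNS`, the thresholds `min_ratio` and `min_count`, and the English placeholder tokens are all hypothetical, and the input is assumed to be already POS-tagged.

```python
from collections import Counter

# Hypothetical POS patterns for single and compound candidate terms.
PATTERNS = {("NOUN",), ("NOUN", "NOUN"), ("NOUN", "ADJ")}

def candidates(tagged_tokens):
    """Yield word sequences whose POS tags match one of PATTERNS."""
    for size in {len(p) for p in PATTERNS}:
        for i in range(len(tagged_tokens) - size + 1):
            window = tagged_tokens[i:i + size]
            if tuple(tag for _, tag in window) in PATTERNS:
                yield " ".join(word for word, _ in window)

def select_fa_terms(corpora, min_ratio=0.8, min_count=2):
    """Keep candidates that occur mostly in one field (corpora comparison)."""
    counts = {field: Counter(candidates(tokens)) for field, tokens in corpora.items()}
    total = Counter()
    for field_counts in counts.values():
        total.update(field_counts)
    return {
        field: sorted(term for term, n in field_counts.items()
                      if n >= min_count and n / total[term] >= min_ratio)
        for field, field_counts in counts.items()
    }

# Toy tagged corpora (placeholder English words stand in for Arabic tokens).
corpora = {
    "sports": [("match", "NOUN"), ("won", "VERB"), ("match", "NOUN"),
               ("today", "ADV"), ("year", "NOUN"), ("by", "ADP"), ("year", "NOUN")],
    "finance": [("bank", "NOUN"), ("central", "ADJ"), ("bank", "NOUN"),
                ("central", "ADJ"), ("year", "NOUN"), ("by", "ADP"), ("year", "NOUN")],
}
fa_terms = select_fa_terms(corpora)
```

In this toy, "year" matches a pattern but is split evenly between the two fields, so the comparison step discards it, while "match" and "bank central" are concentrated in one field and are kept.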

Keywords: Arabic Field Association Terms, information extraction, document classification, information retrieval.

Downloads 1697
33 A Recognition Method for Spatio-Temporal Background in Korean Historical Novels

Authors: Seo-Hee Kim, Kee-Won Kim, Seung-Hoon Kim

Abstract:

The most important elements of a novel are the characters, events and background. The background represents the time, place and situation in which characters appear, and conveys events and atmosphere more realistically. If readers have proper knowledge of the background of a novel, it may help them understand its atmosphere and choose a novel that they want to read. In this paper, we target Korean historical novels because, among the genres of Korean novels, the spatio-temporal background plays an especially important role in historical novels. To the best of our knowledge, no previous study has been aimed at Korean novels. In this paper, we build a Korean historical national dictionary. Our dictionary contains historical places and the temple names of kings over many generations, as well as currently existing spatial or temporal words in Korean history. We also present a method for recognizing spatio-temporal background based on patterns of phrasal words in Korean sentences. Our rules utilize postpositions for spatial background recognition and temple names for temporal background recognition. Knowledge of the recognized background can help readers understand the flow of events and atmosphere, and can be used to visualize the elements of novels.
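A rule of this kind can be sketched in a few lines. The example below is a simplified assumption rather than the authors' rule set: the tiny dictionaries `PLACES` and `TEMPLE_NAMES`, the postposition list, and whitespace tokenization are all illustrative choices.

```python
# Hypothetical mini-dictionary: historical place names and kings' temple names
# mapped to reign years.
PLACES = {"한양", "평양"}
TEMPLE_NAMES = {"세종": (1418, 1450), "정조": (1776, 1800)}
# Locative/directional postpositions, longest first so "에서" is tried before "에".
POSTPOSITIONS = ("에서", "으로", "에", "로")

def recognize_background(sentence):
    """Return (spatial, temporal) background words found in one sentence."""
    spatial, temporal = [], []
    for token in sentence.split():
        token = token.strip(".,!?")
        # Temporal rule: the token is a known temple name.
        if token in TEMPLE_NAMES:
            temporal.append(token)
        # Spatial rule: a known place followed by a locative postposition.
        for post in POSTPOSITIONS:
            stem = token[:-len(post)]
            if token.endswith(post) and stem in PLACES:
                spatial.append(stem)
                break
    return spatial, temporal

spatial, temporal = recognize_background("세종 때 한양에서 큰 잔치가 열렸다.")
```

Here "한양에서" is split into the place "한양" plus the postposition "에서", and "세종" is matched as a temple name, so the sentence is placed in Hanyang during the reign of King Sejong.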

Keywords: Data mining, Korean historical novels, Korean linguistic feature, spatio-temporal background.

Downloads 1089
32 Speaker Identification by Atomic Decomposition of Learned Features Using Computational Auditory Scene Analysis Principals in Noisy Environments

Authors: Thomas Bryan, Veton Kepuska, Ivica Kostanic

Abstract:

Speaker recognition is performed in high Additive White Gaussian Noise (AWGN) environments using principles of Computational Auditory Scene Analysis (CASA). CASA methods often classify sounds from images in the time-frequency (T-F) plane using spectrograms or cochleagrams as the image. In this paper, atomic decomposition implemented by matching pursuit performs a transform from time series speech signals to the T-F plane. The atomic decomposition creates a sparsely populated T-F vector in “weight space” where each populated T-F position contains an amplitude weight. The weight space vector along with the atomic dictionary represents a denoised, compressed version of the original signal. The arrangement of the atomic indices in the T-F vector is used for classification. Unsupervised feature learning implemented by a sparse autoencoder learns a single dictionary of basis features from a collection of envelope samples from all speakers. The approach is demonstrated using pairs of speakers from the TIMIT data set. Pairs of speakers are selected randomly from a single district. Each speaker has 10 sentences; two are used for training and 8 for testing. Atomic index probabilities are created for each training sentence and also for each test sentence. Classification is performed by finding the lowest Euclidean distance between the probabilities from the training sentences and the test sentences. Training is done at a 30 dB Signal-to-Noise Ratio (SNR). Testing is performed at SNRs of 0 dB, 5 dB, 10 dB and 30 dB. The algorithm has a baseline classification accuracy of ~93% averaged over 10 pairs of speakers from the TIMIT data set. The baseline accuracy is attributable to short sequences of training and test data as well as the overall simplicity of the classification algorithm. The accuracy is not affected by AWGN, and the algorithm produces ~93% accuracy at 0 dB SNR.
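The matching pursuit step mentioned above can be illustrated with a minimal generic sketch. It assumes a dictionary of unit-norm atoms given as plain lists and uses a toy orthonormal dictionary instead of the Gabor atoms listed in the keywords; the function name `matching_pursuit` and the stopping tolerance are choices made for this example only.

```python
def matching_pursuit(signal, dictionary, n_atoms):
    """Greedy sparse decomposition of `signal` over unit-norm atoms.

    Returns (weights, residual), where weights maps atom index -> amplitude.
    """
    residual = list(signal)
    weights = {}
    for _ in range(n_atoms):
        # Correlate the current residual with every atom.
        corr = [sum(r * a for r, a in zip(residual, atom)) for atom in dictionary]
        k = max(range(len(dictionary)), key=lambda i: abs(corr[i]))
        if abs(corr[k]) < 1e-12:  # nothing left to explain
            break
        # Record the amplitude and subtract the atom's contribution.
        weights[k] = weights.get(k, 0.0) + corr[k]
        residual = [r - corr[k] * a for r, a in zip(residual, dictionary[k])]
    return weights, residual

# Toy dictionary: the standard basis of R^4 (each atom has unit norm).
atoms = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
weights, residual = matching_pursuit([3.0, 0.0, -2.0, 0.0], atoms, n_atoms=2)
```

The returned `weights` dictionary plays the role of the sparsely populated weight-space vector: only the selected atom indices carry an amplitude, and the pattern of selected indices is what a downstream classifier could compare.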

Keywords: Time-frequency plane, atomic decomposition, envelope sampling, Gabor atoms, matching pursuit, sparse dictionary learning, sparse autoencoder.

Downloads 1522
31 Lexical Based Method for Opinion Detection on Tripadvisor Collection

Authors: Faiza Belbachir, Thibault Schienhinski

Abstract:

The massive development of online social networks allows users to post and share their opinions on various topics. With this huge volume of opinions, it is interesting to extract and interpret this information for different domains, e.g., product and service benchmarking, politics and recommendation systems. This is why opinion detection is one of the most important research tasks. It consists of differentiating between opinion data and factual data. The difficulty of this task is to determine an approach that returns opinionated documents. Generally, two approaches are used for opinion detection: lexical based approaches and machine learning based approaches. In lexical based approaches, a dictionary of sentiment words is used, in which words are associated with weights. The opinion score of a document is derived from the occurrence of words from this dictionary. In machine learning approaches, a classifier is usually trained using a set of annotated documents containing sentiment, and features such as n-grams of words, part-of-speech tags and logical forms. The majority of these works rely on the document text to determine the opinion score but do not take into account whether these texts are really correct. Thus, it is interesting to exploit other information to improve opinion detection. In our work, we develop a new way to consider the opinion score by introducing the notion of a trust score. We determine not only opinionated documents but also whether these opinions are really trustworthy information in relation to the topics. For that, we use the SentiWordNet lexicon to calculate opinion and trust scores, and we compute different features about users (number of their comments, number of their useful comments, average useful review). After that, we combine the opinion score and the trust score to obtain a final score. We applied our method to detect trusted opinions in the TRIPADVISOR collection. Our experimental results show that combining the opinion score with the trust score improves opinion detection.
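To make the idea of combining a lexicon-based opinion score with a user-based trust score concrete, here is a small sketch. The abstract does not give the actual formulas, so everything here is an assumption: the toy `LEXICON` weights (not real SentiWordNet values), the ratio used in `trust_score`, and the linear combination with weight `alpha`.

```python
# Hypothetical sentiment lexicon; real systems would load SentiWordNet scores.
LEXICON = {"great": 0.8, "awful": -0.9, "clean": 0.5}

def opinion_score(text, lexicon):
    """Average absolute sentiment weight per word (how opinionated the text is)."""
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(abs(lexicon.get(word, 0.0)) for word in words) / len(words)

def trust_score(n_comments, n_useful_comments):
    """Share of a user's comments that other users marked as useful (assumed)."""
    return n_useful_comments / n_comments if n_comments else 0.0

def final_score(opinion, trust, alpha=0.5):
    """Linear combination of the two scores; alpha is an assumed weight."""
    return alpha * opinion + (1 - alpha) * trust

review = "Great room awful breakfast"
score = final_score(opinion_score(review, LEXICON), trust_score(10, 4))
```

With this kind of combination, a strongly opinionated review from a user whose comments are rarely found useful is ranked below an equally opinionated review from a user with a better track record.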

Keywords: Tripadvisor, Opinion detection, SentiWordNet, trust score.

Downloads 696
30 Language Politics and Identity in Translation: From a Monolingual Text to Multilingual Text in Chinese Translations

Authors: Chu-Ching Hsu

Abstract:

This paper focuses on how the government-led language policies and the political changes in Taiwan manipulate the language choice in translations and what translation strategies are employed by the translator to show his or her language ideology behind the power struggles and decision-making. Framed by Lefevere’s theoretical concept of translating as rewriting, and carried out as a diachronic and chronological study, this paper specifically sets out to investigate the language ideology and translator’s idiolect in Chinese-language translations of Anglo-American novels. The examples drawn to explore these issues were taken from different Chinese renditions of Mark Twain’s English-language novel The Adventures of Huckleberry Finn, in which there are several different dialogues originally written in the colloquial language and dialect used in the American state of Mississippi and reproduced in Mark Twain’s works. Also, adopting a corpus methodology, many examples are extracted from the translated texts and the source text to illuminate how the translators in Taiwan deal with the dialectal features encoded in Twain’s works, and how different versions of Chinese translations are employed by Taiwanese translators to confirm the language policies and to express their language identity textually in different periods of the past five decades, from the 1960s onward. The findings of this study suggest that the use of Taiwanese dialect and language patterns in translations does relate to the mother-tongue language movement and the language ideology of the translator, as well as to the issue of language identity raised on the island of Taiwan. Furthermore, this study confirms that the change of political power in Taiwan has a significant impact on language policy (assimilationism, pluralism or multiculturalism), which has also turned Taiwan from a monolingual into a multilingual society, where language ideology and identity are revealed not only in people’s daily communication but also in written translations.

Keywords: Language politics and policies, literary translation, mother-tongue, multiculturalism, translator’s ideology.

Downloads 1065
29 Bi-lingual Handwritten Character and Numeral Recognition using Multi-Dimensional Recurrent Neural Networks (MDRNN)

Authors: Kandarpa Kumar Sarma

Abstract:

The continued success of ANN depends considerably on the use of hybrid structures implemented on cooperative frameworks. Hybrid architectures give the ANN the ability to validate heterogeneous learning paradigms. This work describes the implementation of a set of Distributed and Hybrid ANN models for Character Recognition applied to Anglo-Assamese scripts. The objective is to describe the effectiveness of Hybrid ANN setups as innovative means of neural learning for an application like multilingual handwritten character and numeral recognition.

Keywords: Assamese, Feature, Recurrent.

Downloads 1490